IV. Statistical Features

1. Mean and median

mean: arithmetic mean
median: median
nanmedian: median, ignoring NaN values
geomean: geometric mean
harmmean: harmonic mean

>> A = magic(5)

A =

    17    24     1     8    15
    23     5     7    14    16
     4     6    13    20    22
    10    12    19    21     3
    11    18    25     2     9

>> M1 = mean(A)

M1 =

    13    13    13    13    13

>> M2 = median(A)

M2 =

    11    12    13    14    15

>> M3 = nanmedian(A)

M3 =

    11    12    13    14    15

>> M4 = geomean(A)

M4 =

   11.1462   10.9234    8.4557    9.8787   10.7349

>> M5 = harmmean(A)

M5 =

    9.2045    9.1371    3.8098    6.2969    8.0767
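
To make the last two functions less of a black box, here is a minimal sketch that computes the column-wise geometric and harmonic means directly from their definitions, using only base MATLAB. The variable names G and H are my own, and the sketch assumes all entries are positive, as they are for magic(5).

```matlab
A = magic(5);
n = size(A, 1);                 % number of observations in each column

% Geometric mean of each column: exponential of the mean of the logs.
G = exp(mean(log(A), 1));

% Harmonic mean of each column: n divided by the sum of the reciprocals.
H = n ./ sum(1 ./ A, 1);

disp(G)
disp(H)
```

Up to display rounding, G and H should agree with M4 and M5 above.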

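2. Sorting and comparing data

sort: sorts each column independently, in ascending order
sortrows: sorts whole rows, by the first column by default
range: range of the values in each column (maximum minus minimum)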

>> A = rand(5)

A =

    0.8147    0.0975    0.1576    0.1419    0.6557
    0.9058    0.2785    0.9706    0.4218    0.0357
    0.1270    0.5469    0.9572    0.9157    0.8491
    0.9134    0.9575    0.4854    0.7922    0.9340
    0.6324    0.9649    0.8003    0.9595    0.6787

>> S1 = sort(A)

S1 =

    0.1270    0.0975    0.1576    0.1419    0.0357
    0.6324    0.2785    0.4854    0.4218    0.6557
    0.8147    0.5469    0.8003    0.7922    0.6787
    0.9058    0.9575    0.9572    0.9157    0.8491
    0.9134    0.9649    0.9706    0.9595    0.9340

>> S2 = sortrows(A)

S2 =

    0.1270    0.5469    0.9572    0.9157    0.8491
    0.6324    0.9649    0.8003    0.9595    0.6787
    0.8147    0.0975    0.1576    0.1419    0.6557
    0.9058    0.2785    0.9706    0.4218    0.0357
    0.9134    0.9575    0.4854    0.7922    0.9340

>> S3 = range(A)

S3 =

    0.7864    0.8673    0.8130    0.8176    0.8983
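
The difference between sort and sortrows is easier to see on a tiny matrix. The sketch below uses a made-up 3-by-2 matrix; it also reproduces range with max and min, and shows the optional column argument of sortrows (a negative index means descending order).

```matlab
A = [3 5; 1 9; 2 1];                 % made-up example data

S = sort(A);                         % each column sorted on its own: [1 1; 2 5; 3 9]
R = max(A, [], 1) - min(A, [], 1);   % what range(A) computes: [2 8]

B = sortrows(A, 2);                  % rows ordered by column 2, ascending:  [2 1; 3 5; 1 9]
D = sortrows(A, -1);                 % rows ordered by column 1, descending: [3 5; 2 1; 1 9]
```

Note that sort breaks up the rows, while sortrows keeps every row intact and only reorders them.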

3. Variance (var) and standard deviation (std)

var: variance
std: standard deviation
skewness: skewness (a third-order statistic)

>> x = randn(8,2)

x =

    1.0347   -0.8095
    0.7269   -2.9443
   -0.3034    1.4384
    0.2939    0.3252
   -0.7873   -0.7549
    0.8884    1.3703
   -1.1471   -1.7115
   -1.0689   -0.1022

>> dx = var(x)

dx =

    0.8040    2.2308

>> dx1 = var(x,1)

dx1 =

    0.7035    1.9519

>> s = std(x)

s =

    0.8967    1.4936

>> s1 = std(x,2)
Error using var (line 177)
W must be a vector of nonnegative weights, or a scalar 0 or 1.

Error in std (line 38)
y = sqrt(var(varargin{:}));
 
>> s1 = std(x,0)

s1 =

    0.8967    1.4936

>> s1 = std(x,1)

s1 =

    0.8388    1.3971

>> sk = skewness(x)

sk =

   -0.0554   -0.3088

>> sk = skewness(x,1)

sk =

   -0.0554   -0.3088
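
The second argument of var and std is a normalization flag, not a dimension, which is why std(x,2) fails above. The following sketch, on a small made-up sample, spells out what the flags 0 and 1 mean and computes the skewness from its definition. The sample values and variable names are my own. Because flag 1 is the default for skewness, skewness(x) and skewness(x,1) return the same value.

```matlab
x = [2 4 4 4 5 5 7 9]';           % made-up sample, n = 8, mean = 5
n = numel(x);
d = x - mean(x);                  % deviations from the mean

v0 = sum(d.^2) / (n - 1);         % var(x) or var(x,0): divide by n-1
v1 = sum(d.^2) / n;               % var(x,1): divide by n
s1 = sqrt(v1);                    % std(x,1)

% Skewness with flag 1: third central moment divided by s1 cubed.
sk = (sum(d.^3) / n) / s1^3;

% To work along rows instead of columns, pass the dimension as the
% third argument and keep the flag in second place.
X = [1 2 3; 4 6 8];
sRows = std(X, 0, 2);             % one value per row: [1; 2]
```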

4. Covariance (cov) and correlation coefficient (corrcoef)

>> x = ones(1,5)

x =

     1     1     1     1     1

>> r = rand(5,1)

r =

    0.3816
    0.7655
    0.7952
    0.1869
    0.4898

>> X = ones(5,5)

X =

     1     1     1     1     1
     1     1     1     1     1
     1     1     1     1     1
     1     1     1     1     1
     1     1     1     1     1

>> A = magic(5)

A =

    17    24     1     8    15
    23     5     7    14    16
     4     6    13    20    22
    10    12    19    21     3
    11    18    25     2     9

>> C1 = cov(x)

C1 =

     0

>> C2 = cov(r)

C2 =

    0.0667

>> C3 = cov(x,r)

C3 =

         0         0
         0    0.0667

>> C4 = cov(r,x)

C4 =

    0.0667         0
         0         0

>> C5 = cov(X)

C5 =

     0     0     0     0     0
     0     0     0     0     0
     0     0     0     0     0
     0     0     0     0     0
     0     0     0     0     0

>> C6 = cov(A)

C6 =

   52.5000    5.0000  -37.5000  -18.7500   -1.2500
    5.0000   65.0000   -7.5000  -43.7500  -18.7500
  -37.5000   -7.5000   90.0000   -7.5000  -37.5000
  -18.7500  -43.7500   -7.5000   65.0000    5.0000
   -1.2500  -18.7500  -37.5000    5.0000   52.5000

>> C7 = corrcoef(x,r)

C7 =

   NaN   NaN
   NaN     1

>> C8 = corrcoef(A,X)

C8 =

     1   NaN
   NaN   NaN

>> C9 = corrcoef(A)

C9 =

    1.0000    0.0856   -0.5455   -0.3210   -0.0238
    0.0856    1.0000   -0.0981   -0.6731   -0.3210
   -0.5455   -0.0981    1.0000   -0.0981   -0.5455
   -0.3210   -0.6731   -0.0981    1.0000    0.0856
   -0.0238   -0.3210   -0.5455    0.0856    1.0000
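
The link between the two functions can be sketched in a few lines of base MATLAB: the covariance matrix comes from the centered data, and each correlation coefficient is a covariance divided by the two standard deviations involved. The names Ac, C, s, and R are my own.

```matlab
A = magic(5);
n = size(A, 1);

% Center each column, then form the covariance matrix (divide by n-1).
Ac = A - repmat(mean(A, 1), n, 1);
C  = (Ac' * Ac) / (n - 1);          % should agree with cov(A)

% Correlation: R(i,j) = C(i,j) / (s(i) * s(j)).
s = sqrt(diag(C));                  % column standard deviations
R = C ./ (s * s');                  % should agree with corrcoef(A)

disp(max(max(abs(R - corrcoef(A)))))   % difference at rounding level
```

This also explains the NaN entries in C7 and C8: a constant vector has zero standard deviation, so the division becomes 0/0.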

V. Statistical Plots

1. Frequency table of positive integers (tabulate)

>> T = ceil(5*rand(1,10))

T =

     3     4     4     4     2     4     4     1     1     3

>> tbl = tabulate(T)

tbl =

     1     2    20
     2     1    10
     3     2    20
     4     5    50

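The first column lists the values, the second the number of times each value occurs, and the third the percentage of the total.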
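
For a vector of positive integers, the same three columns can be built with base MATLAB alone. This is only a sketch of the idea using accumarray, not a description of how tabulate is implemented. The variable names are my own.

```matlab
T = [3 4 4 4 2 4 4 1 1 3];               % the sample from the transcript above

counts = accumarray(T(:), 1);            % counts(k) = number of times k occurs
values = (1:numel(counts))';             % 1 up to the largest value present
pct    = 100 * counts / numel(T);        % share of each value in percent

freq = [values counts pct];
disp(freq)
```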

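2. Empirical cumulative distribution function plot (cdfplot)

h = cdfplot(x)
[h, stats] = cdfplot(x)

h is the handle of the plotted curve and x is a data vector; the optional output stats holds summary statistics of the sample.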

>> y = evrnd(0,3,100,1);
>> cdfplot(y)
>> hold on
>> x = -20 : .1 : 10;
>> f = evcdf(x,0,3);
>> plot(x,f,'m')
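
What cdfplot draws is the empirical CDF: at each sorted sample value the curve steps up by 1/n. A minimal sketch with base MATLAB follows. It uses randn instead of evrnd so that it runs without the toolbox, and the names ys and F are my own.

```matlab
y  = randn(100, 1);                 % any data vector will do
ys = sort(y);                       % sample values in ascending order
n  = numel(ys);
F  = (1:n)' / n;                    % fraction of samples <= ys(k)

stairs(ys, F)                       % step plot of the empirical CDF
xlabel('x'), ylabel('F(x)')
```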

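3. Least-squares fit line (lsline)

lsline
h = lsline

h holds the handles of the fitted lines. lsline superimposes a least-squares line on each scatter plot in the current axes.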

>> x = 1:10;
>> y1 = x + randn(1,10);
>> scatter(x,y1,25,'b','*')
>> hold on
>> y2 = 2*x + randn(1,10);
>> plot(x,y2,'mo')
>> lsline
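
The line that lsline draws can also be obtained explicitly with polyfit, which is handy when the slope and intercept themselves are needed. The data below are made up for illustration.

```matlab
x = 1:10;
y = [1.2 1.9 3.2 3.8 5.1 6.0 6.8 8.1 9.2 9.9];   % made-up, roughly y = x

c    = polyfit(x, y, 1);        % c(1) is the slope, c(2) the intercept
yfit = polyval(c, x);           % fitted values on the line

plot(x, y, 'b*', x, yfit, 'r-')
```

With an intercept in the model, the residuals y - yfit of a least-squares fit sum to zero, which gives a quick sanity check.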

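4. Normal probability plot (normplot)

h = normplot(X)

If X is a vector, normplot displays its normal probability plot; if X is a matrix, it displays one plot line for each column.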

>> x = normrnd(10,1,25,1);
>> normplot(x)                 % plot a vector
>> figure
>> normplot([x,1.5*x])         % plot a matrix, one line per column

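5. Box plot of sample data (boxplot)

boxplot(X)
boxplot(X, G)
boxplot(axes, X, ...)
boxplot(..., 'name', value)

X is the data to plot; G is a grouping variable; axes is an axes handle; name and value are the name and value of a settable property.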

>> x = randn(100,25);
>> subplot(311),boxplot(x)
>> subplot(312),boxplot(x,'plotstyle','compact')
>> subplot(313),boxplot(x,'notch','on')

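At first I did not understand what a box plot shows. In brief: each box runs from the lower quartile to the upper quartile of one column of X, the line inside the box marks the median, the whiskers reach the most extreme values that are not flagged as outliers, and outliers are drawn as individual points.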
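
To see what the parts of a box stand for, the numbers behind one box can be computed by hand. The sketch below uses a made-up sample and takes the quartiles as the medians of the lower and upper halves of the sorted data. That is one common convention and an assumption on my part; boxplot's own quantile rule may give slightly different quartiles. The factor 1.5 is the usual whisker length.

```matlab
x  = [7 1 3 9 5 4 8 2 6 20];          % made-up sample with one large value
xs = sort(x);
n  = numel(xs);

med = median(xs);                     % line inside the box
q1  = median(xs(1:floor(n/2)));       % lower edge of the box
q3  = median(xs(ceil(n/2)+1:end));    % upper edge of the box
w   = q3 - q1;                        % interquartile range

lo = q1 - 1.5*w;                      % fences: points outside are outliers
hi = q3 + 1.5*w;
inside   = xs(xs >= lo & xs <= hi);
whiskers = [inside(1) inside(end)];   % ends of the whiskers
outliers = xs(xs < lo | xs > hi);     % drawn as separate markers
```

For this sample the box spans 3 to 8 with the median at 5.5, the whiskers end at 1 and 9, and the value 20 is flagged as an outlier.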

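6. Reference lines

refline draws a reference line; refcurve draws a reference curve.

refline(m, b)
refline(coeffs)
refline
hline = refline(...)

m is the slope and b the intercept; coeffs is the two-element vector [m b]; hline is the handle of the reference line.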

>> x = 1 : 10;
>> y = x + randn(1,10);
>> scatter(x,y,25,'b','*')
>> lsline
>> mu = mean(y);
>> hline = refline([0 mu]);
>> set(hline, 'Color', 'r')

refcurve
refcurve(p)
hcurve = refcurve(...)

p is the vector of polynomial coefficients, highest power first; hcurve is the handle of the reference curve.

>> p = [1 -2 -1 0];
>> t = 0 : .1 : 3;
>> y = polyval(p,t) + .5*randn(size(t));
>> plot(t,y,'ro')
>> h = refcurve(p);
>> set(h,'Color','r')
>> q = polyfit(t,y,3);
>> refcurve(q)

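7. Sample probability (capability) plot (capaplot)

p = capaplot(data, specs)
[p, h] = capaplot(data, specs)

data is the sample data and specs is a two-element vector giving the lower and upper limits of the range of interest; h holds the handles of the plot elements.

The function estimates a distribution from data and returns p, the probability that a random variable drawn from that estimated distribution falls within the specified range.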

>> data = normrnd(3,.005,100,1);
>> p1 = capaplot(data,[2.99 3.01])

p1 =

    0.9449

>> grid on; axis tight
>> figure
>> p2 = capaplot(data,[2.995 3.015])

p2 =

    0.8037

>> grid on; axis tight

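8. Histogram with a fitted normal curve (histfit)

histfit(data)
histfit(data, nbins)
histfit(data, nbins, dist)
h = histfit(...)

data is a vector; nbins sets the number of bars; dist names the distribution to fit (normal by default).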

>> r = normrnd(10,1,200,1);
>> histfit(r)
>> h = get(gca,'Children');
>> set(h(2),'FaceColor',[.8 .8 1])
>> figure
>> histfit(r,20)
>> h = get(gca,'Children');
>> set(h(2),'FaceColor',[.8 .8 1])
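
A picture like the one histfit produces can be approximated with base MATLAB: draw the histogram counts as bars, then overlay a normal density scaled by n times the bin width so that both cover the same area. This is a sketch of the idea under that scaling assumption, not a description of histfit's internals, and it uses 10 + randn in place of normrnd so that it runs without the toolbox.

```matlab
r = 10 + randn(200, 1);                       % sample with mean about 10, std about 1
nbins = 20;

[counts, edges] = histcounts(r, nbins);       % bar heights and bin edges
centers = (edges(1:end-1) + edges(2:end)) / 2;
w = edges(2) - edges(1);                      % common bin width

mu = mean(r);  sigma = std(r);                % parameters of the fitted normal
xg = linspace(min(r), max(r), 200);
pdfg = exp(-(xg - mu).^2 / (2*sigma^2)) / (sigma * sqrt(2*pi));

bar(centers, counts, 1, 'FaceColor', [.8 .8 1])
hold on
plot(xg, numel(r) * w * pdfg, 'r', 'LineWidth', 2)   % density scaled to counts
hold off
```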