IV. Statistical Features
1. Mean and median
- mean: arithmetic mean
- median: median
- nanmedian: median, ignoring NaN values
- geomean: geometric mean
- harmmean: harmonic mean
All of these operate column by column on a matrix. Because magic(5) contains no NaN values, nanmedian returns the same result as median here.
    >> A = magic(5)
    A =
        17    24     1     8    15
        23     5     7    14    16
         4     6    13    20    22
        10    12    19    21     3
        11    18    25     2     9
    >> M1 = mean(A)
    M1 =
        13    13    13    13    13
    >> M2 = median(A)
    M2 =
        11    12    13    14    15
    >> M3 = nanmedian(A)
    M3 =
        11    12    13    14    15
    >> M4 = geomean(A)
    M4 =
       11.1462   10.9234    8.4557    9.8787   10.7349
    >> M5 = harmmean(A)
    M5 =
        9.2045    9.1371    3.8098    6.2969    8.0767
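The geometric and harmonic means can be cross-checked against their definitions using only base MATLAB functions. The following is a minimal sketch that assumes every entry is positive (true for magic(5)); the variable names g and h are illustrative and do not come from the original session.

```matlab
% Cross-check geomean and harmmean against their definitions.
% Assumption: every entry of A is positive.
A = magic(5);
g = exp(mean(log(A)));   % geometric mean of each column
h = 1 ./ mean(1 ./ A);   % harmonic mean of each column
disp(g)                  % should agree with geomean(A)
disp(h)                  % should agree with harmmean(A)
```

For the first column this gives about 11.1462 and 9.2045, matching M4 and M5 above.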
2. Comparing data
- sort: sorts each column independently
- sortrows: sorts whole rows (by the first column), keeping each row intact
- range: range of the values (maximum minus minimum) in each column
    >> A = rand(5)
    A =
        0.8147    0.0975    0.1576    0.1419    0.6557
        0.9058    0.2785    0.9706    0.4218    0.0357
        0.1270    0.5469    0.9572    0.9157    0.8491
        0.9134    0.9575    0.4854    0.7922    0.9340
        0.6324    0.9649    0.8003    0.9595    0.6787
    >> S1 = sort(A)
    S1 =
        0.1270    0.0975    0.1576    0.1419    0.0357
        0.6324    0.2785    0.4854    0.4218    0.6557
        0.8147    0.5469    0.8003    0.7922    0.6787
        0.9058    0.9575    0.9572    0.9157    0.8491
        0.9134    0.9649    0.9706    0.9595    0.9340
    >> S2 = sortrows(A)
    S2 =
        0.1270    0.5469    0.9572    0.9157    0.8491
        0.6324    0.9649    0.8003    0.9595    0.6787
        0.8147    0.0975    0.1576    0.1419    0.6557
        0.9058    0.2785    0.9706    0.4218    0.0357
        0.9134    0.9575    0.4854    0.7922    0.9340
    >> S3 = range(A)
    S3 =
        0.7864    0.8673    0.8130    0.8176    0.8983
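The difference between sort and sortrows is easier to see on a small fixed matrix than on random data. The sketch below uses an illustrative 3-by-2 matrix B (not from the original session) and computes the range with max and min from base MATLAB.

```matlab
% sort vs. sortrows on a small deterministic matrix.
B = [3 1; 1 2; 2 0];
S = sort(B);           % each column sorted on its own   -> [1 0; 2 1; 3 2]
R = sortrows(B);       % rows reordered by first column  -> [1 2; 2 0; 3 1]
W = max(B) - min(B);   % per-column range, as range(B)   -> [2 2]
disp(S), disp(R), disp(W)
```

Note that sort breaks up the original rows, while sortrows only changes their order.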
3. Variance (var) and standard deviation (std)
- var: variance
- std: standard deviation
- skewness: skewness (third standardized moment)
The second argument of var and std is a weight flag, not a dimension: 0 (the default) normalizes by N-1 and 1 normalizes by N. Any other scalar, such as 2, raises an error.
    >> x = randn(8,2)
    x =
        1.0347   -0.8095
        0.7269   -2.9443
       -0.3034    1.4384
        0.2939    0.3252
       -0.7873   -0.7549
        0.8884    1.3703
       -1.1471   -1.7115
       -1.0689   -0.1022
    >> dx = var(x)
    dx =
        0.8040    2.2308
    >> dx1 = var(x,1)
    dx1 =
        0.7035    1.9519
    >> s = std(x)
    s =
        0.8967    1.4936
    >> s1 = std(x,2)
    Error using var (line 177)
    W must be a vector of nonnegative weights, or a scalar 0 or 1.
    Error in std (line 38)
    y = sqrt(var(varargin{:}));
    >> s1 = std(x,0)
    s1 =
        0.8967    1.4936
    >> s1 = std(x,1)
    s1 =
        0.8388    1.3971
    >> sk = skewness(x)
    sk =
       -0.0554   -0.3088
    >> sk = skewness(x,1)
    sk =
       -0.0554   -0.3088
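The effect of the second argument can be checked by computing the statistics by hand. The sketch below uses an illustrative data vector (not from the original session) and base MATLAB only; it also shows that the dimension is the third argument of std, which is what a call like std(x,2) was presumably meant to select.

```matlab
% Manual variance, standard deviation and skewness for a small sample.
x = [2 4 4 4 5 5 7 9]';
n = numel(x);
d = x - mean(x);                 % deviations from the mean
v0 = sum(d.^2) / (n - 1);        % same as var(x) or var(x,0)
v1 = sum(d.^2) / n;              % same as var(x,1)
s1 = sqrt(v1);                   % same as std(x,1)
sk = mean(d.^3) / v1^1.5;        % skewness without bias correction
% The dimension goes in the third position: std of each row of M.
M = [1 2 3; 4 6 8];
rowStd = std(M, 0, 2);           % -> [1; 2]
```

Here v1 is 4 and s1 is 2, while v0 is 32/7, so the two normalizations differ noticeably for small samples.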
4. Covariance (cov) and correlation coefficient (corrcoef)
A constant vector has zero variance, so its covariance with anything is 0 and its correlation coefficient is NaN (0/0).
    >> x = ones(1,5)
    x =
         1     1     1     1     1
    >> r = rand(5,1)
    r =
        0.3816
        0.7655
        0.7952
        0.1869
        0.4898
    >> X = ones(5,5)
    X =
         1     1     1     1     1
         1     1     1     1     1
         1     1     1     1     1
         1     1     1     1     1
         1     1     1     1     1
    >> A = magic(5)
    A =
        17    24     1     8    15
        23     5     7    14    16
         4     6    13    20    22
        10    12    19    21     3
        11    18    25     2     9
    >> C1 = cov(x)
    C1 =
         0
    >> C2 = cov(r)
    C2 =
        0.0667
    >> C3 = cov(x,r)
    C3 =
             0         0
             0    0.0667
    >> C4 = cov(r,x)
    C4 =
        0.0667         0
             0         0
    >> C5 = cov(X)
    C5 =
         0     0     0     0     0
         0     0     0     0     0
         0     0     0     0     0
         0     0     0     0     0
         0     0     0     0     0
    >> C6 = cov(A)
    C6 =
       52.5000    5.0000  -37.5000  -18.7500   -1.2500
        5.0000   65.0000   -7.5000  -43.7500  -18.7500
      -37.5000   -7.5000   90.0000   -7.5000  -37.5000
      -18.7500  -43.7500   -7.5000   65.0000    5.0000
       -1.2500  -18.7500  -37.5000    5.0000   52.5000
    >> C7 = corrcoef(x,r)
    C7 =
       NaN   NaN
       NaN     1
    >> C8 = corrcoef(A,X)
    C8 =
         1   NaN
       NaN   NaN
    >> C9 = corrcoef(A)
    C9 =
        1.0000    0.0856   -0.5455   -0.3210   -0.0238
        0.0856    1.0000   -0.0981   -0.6731   -0.3210
       -0.5455   -0.0981    1.0000   -0.0981   -0.5455
       -0.3210   -0.6731   -0.0981    1.0000    0.0856
       -0.0238   -0.3210   -0.5455    0.0856    1.0000
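The correlation matrix is the covariance matrix rescaled by the column standard deviations, which also explains the NaN entries for constant data (a zero standard deviation gives 0/0). A minimal sketch using base MATLAB only; the names s, R and err are illustrative.

```matlab
% Derive the correlation matrix from the covariance matrix.
A = magic(5);
C = cov(A);
s = sqrt(diag(C));                       % column standard deviations
R = C ./ (s * s');                       % R(i,j) = C(i,j) / (s(i)*s(j))
err = max(max(abs(R - corrcoef(A))));    % largest deviation from corrcoef
disp(err)                                % expected to be at rounding level
```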
V. Statistical Plots
1. Frequency table of positive integers (tabulate)
    >> T = ceil(5*rand(1,10))
    T =
         3     4     4     4     2     4     4     1     1     3
    >> table = tabulate(T)
    table =
         1     2    20
         2     1    10
         3     2    20
         4     5    50
The first column lists the values, the second their counts, and the third the corresponding percentages.
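The same three columns can be rebuilt with base MATLAB, which makes clear what each column holds. The sketch below hard-codes the vector T from the session above and assumes its entries are positive integers; accumarray does the counting.

```matlab
% Rebuild the frequency table by hand for positive-integer data.
T = [3 4 4 4 2 4 4 1 1 3];
counts = accumarray(T(:), 1);       % counts(k) = number of times k occurs
values = (1:numel(counts))';        % 1 .. max(T)
pct = 100 * counts / numel(T);      % percentage of all observations
tbl = [values counts pct];
disp(tbl)
```

The result is [1 2 20; 2 1 10; 3 2 20; 4 5 50], the same table that tabulate returned.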
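2. Empirical cumulative distribution function plot (cdfplot)
- [h, stats] = cdfplot(x)
h is the handle to the plotted curve and x is a data vector; the optional output stats holds summary statistics of the sample (minimum, maximum, mean, median, standard deviation).
The example overlays the theoretical extreme-value CDF on the empirical one. Note that the range for x needs two colons (-20 : .1 : 10).
    >> y = evrnd(0,3,100,1);
    >> cdfplot(y)
    >> hold on
    >> x = -20 : .1 : 10;
    >> f = evcdf(x,0,3);
    >> plot(x,f,'m')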
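What cdfplot draws is the empirical CDF: at each sorted sample value, the fraction of samples less than or equal to it. A minimal sketch with base MATLAB and an illustrative five-point sample; it approximates the figure rather than reproducing the exact appearance of cdfplot.

```matlab
% Empirical CDF by hand for a small sample.
y = [2 5 1 4 3];
ys = sort(y);            % sample values in ascending order
n = numel(ys);
F = (1:n) / n;           % F(k) = fraction of samples <= ys(k)
stairs(ys, F)            % step plot of the empirical CDF
xlabel('x'), ylabel('F(x)')
```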
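3. Least-squares fit line (lsline)
- lsline
- h = lsline
h holds the handle(s) of the fitted line(s). lsline superimposes a least-squares line on each scatter plot in the current axes, so it fits discrete data directly.
    >> x = 1:10;
    >> y1 = x + randn(1,10);
    >> scatter(x,y1,25,'b','*')
    >> hold on
    >> y2 = 2*x + randn(1,10);
    >> plot(x,y2,'mo')
    >> lsline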
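To get at the fitted coefficients themselves, rather than only the drawn line, a first-degree polyfit performs the same kind of least-squares fit. This is a sketch with noise-free illustrative data, so the recovered slope and intercept are known in advance.

```matlab
% Least-squares straight line with polyfit (degree 1).
x = 1:10;
y = 2*x + 1;               % noise-free data: slope 2, intercept 1
p = polyfit(x, y, 1);      % p(1) = slope, p(2) = intercept
yfit = polyval(p, x);      % fitted values on the same grid
plot(x, y, 'mo', x, yfit, 'b-')
```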
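4. Normal probability plot (normplot)
- h = normplot(X)
If X is a vector, a single normal probability plot is drawn; if X is a matrix, one plot is drawn for each column.
    >> x = normrnd(10,1,25,1);
    >> normplot(x)            % plot a vector
    >> figure
    >> normplot([x,1.5*x])    % plot a matrix, one line per column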
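A normal probability plot pairs the sorted data with standard normal quantiles; points from a normal sample fall close to a straight line. The sketch below uses base MATLAB only and assumes plotting positions (i - 0.5)/n, a common convention that may not match normplot exactly; the sample values are illustrative.

```matlab
% Coordinates of a normal probability plot, computed by hand.
x = [9.1 10.4 9.8 11.2 10.0];
xs = sort(x);                       % data in ascending order
n = numel(xs);
p = ((1:n) - 0.5) / n;              % assumed plotting positions
z = -sqrt(2) * erfcinv(2 * p);      % standard normal quantiles of p
plot(xs, z, '+')
xlabel('Data'), ylabel('Standard normal quantile')
```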
5. Box plot of sample data (boxplot)
- boxplot(X)
- boxplot(X, G)
- boxplot(axes, X, ...)
- boxplot(..., 'name', value)
X is the data to plot; G is an additional grouping variable; axes is an axes handle; name and value are the name and value of a property to set.
    >> x = randn(100,25);
    >> subplot(311),boxplot(x)
    >> subplot(312),boxplot(x,'plotstyle','compact')
    >> subplot(313),boxplot(x,'notch','on')
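I did not really understand box plots when I wrote this. In brief: each box spans the first to the third quartile with the median marked inside, the whiskers reach to the most extreme points that are not outliers, and outliers are drawn individually.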
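The numbers behind a single box can be computed directly, which helps when reading the plot. This sketch uses quantile from the Statistics Toolbox (already required by the functions in these notes) and the default whisker factor of 1.5; the data vector is illustrative.

```matlab
% Quantities summarized by one box of a box plot.
x = [1 2 3 4 5 6 7 8 9 30];
q = quantile(x, [0.25 0.5 0.75]);    % box edges and median line
w = 1.5 * (q(3) - q(1));             % 1.5 times the interquartile range
lo = q(1) - w;                       % lower outlier threshold
hi = q(3) + w;                       % upper outlier threshold
outliers = x(x < lo | x > hi);       % points drawn individually
disp(q), disp(outliers)
```

In this sample only the value 30 lies beyond the thresholds, so it would be drawn as a separate outlier marker.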
6. Reference lines
refline draws a reference straight line; refcurve draws a reference curve.
- refline(m, b)
- refline(coeffs)
- refline
- hline = refline(...)
m is the slope and b the intercept; coeffs is the two-element vector [m b]; hline is the handle of the reference line.
    >> x = 1 : 10;
    >> y = x + randn(1,10);
    >> scatter(x,y,25,'b','*')
    >> lsline
    >> mu = mean(y);
    >> hline = refline([0 mu]);
    >> set(hline, 'Color', 'r')
- refcurve
- refcurve(p)
- hcurve = refcurve(...)
p is the vector of polynomial coefficients, in descending powers.
    >> p = [1 -2 -1 0];
    >> t = 0 : .1 : 3;
    >> y = polyval(p,t) + .5*randn(size(t));
    >> plot(t,y,'ro')
    >> h = refcurve(p);
    >> set(h,'Color','r')
    >> q = polyfit(t,y,3);
    >> refcurve(q)
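The coefficient vector passed to refcurve is an ordinary polynomial in descending powers, so the same curve can be evaluated with polyval. A small sketch with base MATLAB; the grid t and the plotting style are illustrative choices.

```matlab
% Evaluate and draw the reference polynomial by hand.
p = [1 -2 -1 0];             % y = t^3 - 2*t^2 - t
t = linspace(0, 3, 31);      % evaluation grid on [0, 3]
yc = polyval(p, t);          % curve values
y2 = polyval(p, 2);          % 8 - 8 - 2 + 0 = -2
plot(t, yc, 'r-')
```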
7. Sample probability plot (capaplot)
- p = capaplot(data, specs)
- [p, h] = capaplot(data, specs)
data is the sample data and specs specifies the range; p is the probability of falling within that range.
The function fits a distribution to the data and returns the probability that a random variable from the estimated distribution falls within the specified range.
    >> data = normrnd(3,.005,100,1);
    >> p1 = capaplot(data,[2.99 3.01])
    p1 =
        0.9449
    >> grid on; axis tight
    >> figure
    >> p2 = capaplot(data,[2.995 3.015])
    p2 =
        0.8037
    >> grid on; axis tight
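The returned probability is the area of a normal density between the two limits. The sketch below evaluates that area with base MATLAB, using the true parameters (mean 3, standard deviation 0.005) instead of values estimated from a sample, so it illustrates the calculation rather than reproducing capaplot's exact output; Phi is an illustrative helper name.

```matlab
% Probability that a normal variable falls inside [specs(1), specs(2)].
mu = 3;
sigma = 0.005;
specs = [2.99 3.01];                          % mu - 2*sigma .. mu + 2*sigma
Phi = @(z) 0.5 * erfc(-z / sqrt(2));          % standard normal CDF
p = Phi((specs(2) - mu) / sigma) - Phi((specs(1) - mu) / sigma);
disp(p)                                       % about 0.9545
```

capaplot works from the sample mean and standard deviation, which is why the session above reports 0.9449 rather than 0.9545 for the same limits.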
8. Histogram with a normal fit (histfit)
- histfit(data)
- histfit(data, nbins)
- histfit(data, nbins, dist)
- h = histfit(...)
data is a vector; nbins sets the number of bars; dist is the type of distribution to fit.
    >> r = normrnd(10,1,200,1);
    >> histfit(r)
    >> h = get(gca,'Children');
    >> set(h(2),'FaceColor',[.8 .8 1])
    >> figure
    >> histfit(r,20)
    >> h = get(gca,'Children');
    >> set(h(2),'FaceColor',[.8 .8 1])
Please credit the source when reposting: 燕骏博客 » MATLAB Self-Study Notes (23): Probability and Statistics 2