
Experiment 10: Distance Discriminant Analysis

I. Objectives and Requirements

Objectives: master the theory and methods of distance discriminant analysis, including model building and error-rate estimation; learn to use the SAS discriminant analysis procedure to solve practical problems.

Requirements: write the programs and analyze the results.

Content: Problem 1 is required; choose one or two of Problems 2, 3, and 4.

1. Write out several distance formulas and the distance discriminant rule for two populations.

Euclidean distance:

\[ d(x,y) = \Big[ \sum_{i=1}^{p} (x_i - y_i)^2 \Big]^{1/2} \]

Minkowski distance:

\[ d(x,y) = \Big[ \sum_{i=1}^{p} |x_i - y_i|^m \Big]^{1/m} \]

(1) Let G be a population with mean vector \mu and covariance matrix \Sigma, and let x, y be drawn from G.

Mahalanobis distance between x and y:

\[ d(x,y) = \big[ (x-y)^T \Sigma^{-1} (x-y) \big]^{1/2} \]

Mahalanobis distance between x and G:

\[ d(x,G) = \big[ (x-\mu)^T \Sigma^{-1} (x-\mu) \big]^{1/2} \]

(2) Let G_1, G_2 be two populations with mean vectors \mu_1, \mu_2 and a common covariance matrix \Sigma.

Mahalanobis distance between G_1 and G_2:

\[ d(G_1,G_2) = \big[ (\mu_1-\mu_2)^T \Sigma^{-1} (\mu_1-\mu_2) \big]^{1/2} \]

Distance discriminant rule. Let G_1, G_2 be two known p-dimensional populations with mean vectors \mu_1, \mu_2 and covariance matrices \Sigma_1, \Sigma_2, and let x = (x_1, x_2, \ldots, x_p)^T be the observation to be classified. The rule is:

\[ x \in G_1 \ \text{if } d(x,G_1) < d(x,G_2), \qquad x \in G_2 \ \text{if } d(x,G_1) > d(x,G_2). \]

2. Textbook Exercise 5.3

data examp5_1;   /* build the training sample */
  input group $ x1 x2 x3 x4 x5 x6 x7 x8;   /* read the population label (two groups) and the numeric variables x1-x8 */
  cards;
G1  8.35 23.53  7.51  8.62 17.42 10.00 1.04 11.21
G1  8.19 30.50  4.72  9.78 16.28  7.60 2.52 10.32
G1  7.73 29.20  5.42  9.43 19.29  8.49 2.52 10.00
G1  9.42 27.93  8.20  8.14 16.17  9.42 1.55  9.76
G1  9.16 27.98  9.01  9.32 15.99  9.10 1.82 11.35
G1 10.06 28.64 10.52 10.05 16.18  8.39 1.96 10.81
G1  9.09 28.12  7.40  9.62 17.26 11.12 2.49 12.65
G1  9.41 28.20  5.77 10.80 16.36 11.56 1.53 12.17
G1  8.70 28.12  7.21 10.53 19.45 13.30 1.66 11.96
G1  6.93 29.85  4.54  9.49 16.62 10.65 1.88 13.61
G1  8.67 36.05  7.31  7.75 16.67 11.68 2.38 12.88
G1  9.98 37.69  7.01  8.94 16.15 11.08 0.83 11.67
G1  6.77 38.69  6.01  8.82 14.79 11.44 1.74 13.23
G1  8.14 37.75  9.61  8.49 13.15  9.76 1.28 11.28
G1  7.67 35.71  8.04  8.31 15.13  7.76 1.41 13.25
G1  7.90 39.77  8.49 12.94 19.27 11.05 2.04 13.29
G1  7.18 40.91  7.32  8.94 17.60 12.75 1.14 14.80
G1  8.82 33.70  7.59 10.98 18.82 14.73 1.78 10.10
G1  6.25 35.02  4.72  6.28 10.03  7.15 1.93 10.39
G2 10.60 52.41  7.70  9.98 12.53 11.70 2.31 14.69
G2  7.27 52.65  3.84  9.16 13.03 15.26 1.98 14.57
G2 13.45 55.85  5.50  7.45  9.55  9.52 2.21 16.30
G2 10.85 44.68  7.32 14.51 17.13 12.08 1.26 11.57
G2  7.21 45.79  7.66 10.36 16.56 12.86 2.25 11.69
G2  7.68 50.37 11.35 13.30 19.25 14.59 2.75 14.87
G2  7.78 48.44  8.00 20.51 22.12 15.73 1.15 16.61
run;

data test5_1;   /* build the test sample (variables must match the training sample) */
  input x1 x2 x3 x4 x5 x6 x7 x8;
  cards;
 7.94 39.65 20.97 20.82 22.52 12.41 1.75  7.90
 8.28 64.34 22.22 20.06 15.12  0.72 22.89
12.47 76.39  5.52 11.24 14.52 22.00 5.46 25.50
run;

/* Call the DISCRIM procedure for discriminant analysis.
   data=examp5_1     training sample
   testdata=test5_1  test sample
   pool=yes          assume the populations share a common covariance matrix
   method=normal     assume each population is normal and estimate its mean
                     vector and covariance matrix from the training sample
   listerr           print only the observations misclassified on resubstitution
   crosslisterr      run cross-validation on the training sample and print the
                     misclassified observations
   testlist          list the classification result for every test observation
   wcov pcov         print the within-class covariance matrix of each class and
                     the pooled within-class covariance matrix, both estimated
                     from the training sample */
proc discrim data=examp5_1 testdata=test5_1 pool=yes method=normal
             listerr crosslisterr testlist wcov pcov;
  class group;     /* classification variable: group */
  var x1-x8;       /* analysis variables x1-x8 */
  priors equal;    /* equal prior probabilities for the populations */
run;

Sample covariance matrices:

The SAS System                                   21:30 Monday, December
The DISCRIM Procedure

Observations    27    DF Total              26
Variables        8    DF Within Classes     25
Classes          2    DF Between Classes     1

Class Level Information

group   Variable Name   Frequency    Weight   Proportion   Prior Probability
G1      G1                     20   20.0000     0.740741            0.500000
G2      G2                      7    7.0000     0.259259            0.500000

Within-Class Covariance Matrices

group = G1, DF = 19

Variable           x1            x2            x3            x4            x5            x6            x7
x1         1.14196079   -2.48637289    0.88566132    0.43201211    0.80854737    0.32772711   -0.05620368
x2        -2.48637289   28.16582605    0.60424447   -0.28940053   -2.98873684    1.86061553   -0.44746368
x3         0.88566132    0.60424447    2.63853132    0.40672579    0.33861053    0.20392342   -0.21200053
x4         0.43201211   -0.28940053    0.40672579    1.90755684    2.16723158    1.20435737    0.10640421
x5         0.80854737   -2.98873684    0.33861053    2.16723158    4.86160000    2.37107895    0.16386316
x6         0.32772711    1.86061553    0.20392342    1.20435737    2.37107895    3.92039447   -0.20731632
x7        -0.05620368   -0.44746368   -0.21200053    0.10640421    0.16386316   -0.20731632    0.23514632
x8        -0.55457579    3.89018316   -0.01978526    0.26395579    0.30452105    0.98705158   -0.12042842

group = G2, DF = 6

Variable           x1            x2            x3            x4            x5            x6            x7
x1         5.78819048    4.06045952   -1.37709524   -3.90320476   -6.55061190   -4.75851905   -0.046790
x2         4.06045952   15.94082381   -3.32630476   -9.71492857  -11.77165476   -2.61813095    1.068807
x3        -1.37709524   -3.32630476    5.39728095
