DISCRIMINANT ANALYSIS - Ferns in two meadows
05:21 Thursday, November 10, 2005


A FIRST LOOK AT THE DATA
NOTE THAT THE MEAN DIFFERENCES ARE GREATEST FOR Y2 and Y4

The MEANS Procedure

            N
  type    Obs    Variable            Mean         Std Dev
  -------------------------------------------------------
  OO       22    y1            42.0454545      12.2376362
                 y2            68.8181818      15.1331750
                 y3            44.2727273       8.1426673
                 y4            83.2272727      14.8289672

  XX       14    y1            48.3571429       8.0823373
                 y2            86.7857143      10.3416899
                 y3            41.5000000      13.0428289
                 y4            66.9285714      15.7551666
  -------------------------------------------------------


A FIRST LOOK AT THE DATA
TYPE 1 SEEMS TO HAVE HIGH Y2 and LOW Y4

Plot of y2*y4.  Symbol is value of type.

[Scatter plot: y2 on the vertical axis (ticks 40 to 120), y4 on the horizontal axis (ticks 20 to 120); points plotted as O or X by type.]


TWO-SAMPLE T-TESTS
ONLY Y2 and Y4 ARE SIGNIFICANTLY DIFFERENT

The TTEST Procedure

Statistics

                                    Lower CL              Upper CL    Lower CL
  Variable    type            N         Mean      Mean        Mean     Std Dev    Std Dev
  y1          OO             22        36.62    42.045      47.471       9.415     12.238
  y1          XX             14       43.691    48.357      53.024      5.8593     8.0823
  y1          Diff (1-2)              -13.84    -6.312      1.2188      8.7671     10.839
  y2          OO             22       62.109    68.818      75.528      11.643     15.133
  y2          XX             14       80.815    86.786      92.757      7.4972     10.342
  y2          Diff (1-2)              -27.35    -17.97      -8.586      10.923     13.503
  y3          OO             22       40.662    44.273      47.883      6.2646     8.1427
  y3          XX             14       33.969      41.5      49.031      9.4555     13.043
  y3          Diff (1-2)               -4.38    2.7727      9.9259      8.3277     10.295
  y4          OO             22       76.652    83.227      89.802      11.409     14.829
  y4          XX             14       57.832    66.929      76.025      11.422     15.755
  y4          Diff (1-2)               5.745    16.299      26.852      12.287      15.19

Statistics

                              Upper CL
  Variable    type             Std Dev    Std Err    Minimum    Maximum
  y1          OO                17.488     2.6091         25         68
  y1          XX                13.021     2.1601         33         63
  y1          Diff (1-2)        14.201     3.7055
  y2          OO                21.626     3.2264         48        111
  y2          XX                16.661     2.7639         66        100
  y2          Diff (1-2)        17.692     4.6166
  y3          OO                11.636      1.736         31         56
  y3          XX                21.013     3.4858         18         68
  y3          Diff (1-2)        13.489     3.5198
  y4          OO                21.192     3.1615         57        113
  y4          XX                25.382     4.2107         34         95
  y4          Diff (1-2)        19.902     5.1931

T-Tests

  Variable    Method           Variances      DF    t Value    Pr > |t|
  y1          Pooled           Equal          34      -1.70      0.0976
  y1          Satterthwaite    Unequal      33.9      -1.86      0.0711
  y2          Pooled           Equal          34      -3.89      0.0004
  y2          Satterthwaite    Unequal      33.8      -4.23      0.0002
  y3          Pooled           Equal          34       0.79      0.4363
  y3          Satterthwaite    Unequal      19.5       0.71      0.4849
  y4          Pooled           Equal          34       3.14      0.0035
  y4          Satterthwaite    Unequal      26.6       3.10      0.0046

Equality of Variances

  Variable    Method      Num DF    Den DF    F Value    Pr > F
  y1          Folded F        21        13       2.29    0.1265
  y2          Folded F        21        13       2.14    0.1599
  y3          Folded F        13        21       2.57    0.0530
  y4          Folded F        13        21       1.13    0.7797


LINEAR DISCRIMINANT ANALYSIS

The DISCRIM Procedure

  Observations     36          DF Total                35
  Variables         4          DF Within Classes       34
  Classes           2          DF Between Classes       1

Class Level Information

            Variable                                            Prior
  type      Name        Frequency      Weight    Proportion    Probability
  OO        OO                 22     22.0000      0.611111       0.500000
  XX        XX                 14     14.0000      0.388889       0.500000

Pooled Covariance Matrix Information

                    Natural Log of the
   Covariance       Determinant of the
  Matrix Rank        Covariance Matrix
            4                 19.62464

Pairwise Generalized Squared Distances Between Groups

     2         _   _       -1  _   _
    D (i|j) = (X - X )' COV   (X - X )
                i   j           i   j

Generalized Squared Distance to type

  From type          OO           XX
  OO                  0      4.83504
  XX            4.83504            0

Linear Discriminant Function

                   _     -1 _                              -1 _
    Constant = -.5 X' COV   X      Coefficient Vector = COV   X
                    j        j                                 j

Linear Discriminant Function for type

  Variable            OO            XX
  Constant     -38.30350     -38.64957
  y1             0.54383       0.50420
  y2             0.08486       0.24756
  y3             0.39885       0.38016
  y4             0.36339       0.23393

Classification Results for Calibration Data: WORK.FERNS
Resubstitution Results using Linear Discriminant Function

Generalized Squared Distance Function

     2         _       -1    _
    D (X) = (X-X )' COV   (X-X )
     j          j             j

Posterior Probability of Membership in Each type

                       2                    2
    Pr(j|X) = exp(-.5 D (X)) / SUM exp(-.5 D (X))
                       j        k           k

Posterior Probability of Membership in type

         From    Classified
  Obs    type    into type          OO        XX
    1    XX      XX             0.0324    0.9676
    2    XX      XX             0.0761    0.9239
    3    XX      XX             0.0385    0.9615
    4    XX      XX             0.2232    0.7768
    5    XX      XX             0.0664    0.9336
    6    XX      XX             0.1109    0.8891
    7    XX      XX             0.0457    0.9543
    8    XX      XX             0.4994    0.5006
    9    XX      XX             0.0132    0.9868
   10    XX      XX             0.1281    0.8719
   11    XX      XX             0.2110    0.7890
   12    XX      XX             0.0022    0.9978
   13    XX      XX             0.0207    0.9793
   14    XX      OO *           0.8587    0.1413
   15    OO      OO             0.9647    0.0353
   16    OO      OO             0.9837    0.0163
   17    OO      OO             0.9994    0.0006
   18    OO      OO             0.9986    0.0014
   19    OO      XX *           0.3313    0.6687
   20    OO      OO             0.5931    0.4069
   21    OO      OO             0.9684    0.0316
   22    OO      OO             0.9466    0.0534
   23    OO      OO             0.9408    0.0592
   24    OO      OO             0.5947    0.4053
   25    OO      OO             0.7836    0.2164
   26    OO      OO             0.9758    0.0242
   27    OO      XX *           0.1001    0.8999
   28    OO      OO             0.9186    0.0814
   29    OO      OO             0.9928    0.0072
   30    OO      OO             0.9937    0.0063
   31    OO      OO             0.6214    0.3786
   32    OO      OO             0.8352    0.1648
   33    OO      OO             0.9173    0.0827
   34    OO      XX *           0.4450    0.5550
   35    OO      OO             0.8805    0.1195
   36    OO      OO             0.7547    0.2453

  * Misclassified observation

Classification Summary for Calibration Data: WORK.FERNS
Resubstitution Summary using Linear Discriminant Function

Number of Observations and Percent Classified into type

  From type          OO          XX       Total
  OO                 19           3          22
                  86.36       13.64      100.00
  XX                  1          13          14
                   7.14       92.86      100.00
  Total              20          16          36
                  55.56       44.44      100.00
  Priors            0.5         0.5

Error Count Estimates for type

                 OO          XX       Total
  Rate       0.1364      0.0714      0.1039
  Priors     0.5000      0.5000

Classification Results for Calibration Data: WORK.FERNS
Cross-validation Results using Linear Discriminant Function

Generalized Squared Distance Function

     2         _          -1      _
    D (X) = (X-X    )' COV    (X-X    )
     j          (X)j      (X)     (X)j

Posterior Probability of Membership in type

         From    Classified
  Obs    type    into type          OO        XX
    8    XX      OO *           0.5382    0.4618
   14    XX      OO *           0.9497    0.0503
   19    OO      XX *           0.0926    0.9074
   24    OO      XX *           0.4727    0.5273
   27    OO      XX *           0.0046    0.9954
   34    OO      XX *           0.3186    0.6814

  * Misclassified observation

Classification Summary for Calibration Data: WORK.FERNS
Cross-validation Summary using Linear Discriminant Function

Number of Observations and Percent Classified into type

  From type          OO          XX       Total
  OO                 18           4          22
                  81.82       18.18      100.00
  XX                  2          12          14
                  14.29       85.71      100.00
  Total              20          16          36
                  55.56       44.44      100.00
  Priors            0.5         0.5

Error Count Estimates for type

                 OO          XX       Total
  Rate       0.1818      0.1429      0.1623
  Priors     0.5000      0.5000

Classification Results for Test Data: WORK.MOREDAT
Classification Results using Linear Discriminant Function

Posterior Probability of Membership in type

         Classified
  Obs    into type          OO        XX
    1    XX             0.0005    0.9995
    2    OO             0.9421    0.0579
    3    OO             0.9678    0.0322
    4    XX             0.0021    0.9979

Classification Summary for Test Data: WORK.MOREDAT
Classification Summary using Linear Discriminant Function

Number of Observations and Percent Classified into type

                 OO          XX       Total
  Total           2           2           4
              50.00       50.00      100.00
  Priors        0.5         0.5


LINEAR DISCRIMINANT ANALYSIS USING ONLY Y2 AND Y4
THERE ARE FEWER CROSSVALIDATION ERRORS!

The DISCRIM Procedure

  Observations     36          DF Total                35
  Variables         2          DF Within Classes       34
  Classes           2          DF Between Classes       1

Class Level Information

            Variable                                            Prior
  type      Name        Frequency      Weight    Proportion    Probability
  OO        OO                 22     22.0000      0.611111       0.500000
  XX        XX                 14     14.0000      0.388889       0.500000

Pooled Covariance Matrix Information

                    Natural Log of the
   Covariance       Determinant of the
  Matrix Rank        Covariance Matrix
            2                 10.49186

Pairwise Generalized Squared Distances Between Groups

Generalized Squared Distance to type

  From type          OO           XX
  OO                  0      4.67737
  XX            4.67737            0

Linear Discriminant Function for type

  Variable            OO            XX
  Constant     -20.33151     -22.91663
  y2             0.26108       0.41137
  y4             0.27270       0.15139

Classification Results for Calibration Data: WORK.FERNS
Resubstitution Results using Linear Discriminant Function

Posterior Probability of Membership in type

         From    Classified
  Obs    type    into type          OO        XX
    8    XX      OO *           0.5493    0.4507
   14    XX      OO *           0.8154    0.1846
   19    OO      XX *           0.1555    0.8445
   27    OO      XX *           0.0565    0.9435

  * Misclassified observation

Classification Summary for Calibration Data: WORK.FERNS
Resubstitution Summary using Linear Discriminant Function

Number of Observations and Percent Classified into type

  From type          OO          XX       Total
  OO                 20           2          22
                  90.91        9.09      100.00
  XX                  2          12          14
                  14.29       85.71      100.00
  Total              22          14          36
                  61.11       38.89      100.00
  Priors            0.5         0.5

Error Count Estimates for type

                 OO          XX       Total
  Rate       0.0909      0.1429      0.1169
  Priors     0.5000      0.5000

Classification Results for Calibration Data: WORK.FERNS
Cross-validation Results using Linear Discriminant Function

Posterior Probability of Membership in type

         From    Classified
  Obs    type    into type          OO        XX
    8    XX      OO *           0.5808    0.4192
   14    XX      OO *           0.8977    0.1023
   19    OO      XX *           0.1127    0.8873
   27    OO      XX *           0.0060    0.9940

  * Misclassified observation

Classification Summary for Calibration Data: WORK.FERNS
Cross-validation Summary using Linear Discriminant Function

Number of Observations and Percent Classified into type

  From type          OO          XX       Total
  OO                 20           2          22
                  90.91        9.09      100.00
  XX                  2          12          14
                  14.29       85.71      100.00
  Total              22          14          36
                  61.11       38.89      100.00
  Priors            0.5         0.5

Error Count Estimates for type

                 OO          XX       Total
  Rate       0.0909      0.1429      0.1169
  Priors     0.5000      0.5000
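
The y2/y4-only discriminant functions and the error-count estimates reported in this listing can be cross-checked by hand. The Python sketch below is not part of the SAS run; it is a minimal illustration that copies the constants and coefficients from the "Linear Discriminant Function for type" table and assumes the equal priors (0.5, 0.5) shown in the output. The names LDF, PRIORS, scores, posteriors, classify and error_count_estimate are hypothetical helpers introduced only for this example.

```python
import math

# Constants and coefficients copied from the "Linear Discriminant Function
# for type" table of the y2/y4-only analysis:
# (constant, y2 coefficient, y4 coefficient).
LDF = {
    "OO": (-20.33151, 0.26108, 0.27270),
    "XX": (-22.91663, 0.41137, 0.15139),
}
PRIORS = {"OO": 0.5, "XX": 0.5}  # equal priors, as in the listing


def scores(y2, y4):
    """Linear discriminant score of one fern for each type."""
    return {t: c + b2 * y2 + b4 * y4 for t, (c, b2, b4) in LDF.items()}


def posteriors(y2, y4, priors=PRIORS):
    """Posterior probability of each type for one fern.

    With a pooled covariance matrix, exp(-.5 D_j^2(X)) is proportional to
    exp(score_j), so normalising prior-weighted exp(score) gives Pr(j|X).
    """
    s = scores(y2, y4)
    top = max(s.values())  # subtract the maximum for numerical stability
    w = {t: priors[t] * math.exp(s[t] - top) for t in s}
    total = sum(w.values())
    return {t: w[t] / total for t in w}


def classify(y2, y4):
    """Type with the largest posterior probability."""
    post = posteriors(y2, y4)
    return max(post, key=post.get)


def error_count_estimate(errors, sizes, priors=PRIORS):
    """Prior-weighted average of the per-type misclassification rates."""
    return sum(priors[t] * errors[t] / sizes[t] for t in sizes)


if __name__ == "__main__":
    # Group means of (y2, y4) taken from the MEANS table.
    means = {"OO": (68.8181818, 83.2272727), "XX": (86.7857143, 66.9285714)}
    for label, (y2, y4) in means.items():
        print(label, classify(y2, y4), posteriors(y2, y4))

    sizes = {"OO": 22, "XX": 14}
    # Cross-validation error counts: four variables, then y2 and y4 only.
    print(error_count_estimate({"OO": 4, "XX": 2}, sizes))
    print(error_count_estimate({"OO": 2, "XX": 2}, sizes))
```

Evaluated at a group mean, the difference between the two scores should be close to half the generalized squared distance between the groups (4.67737 / 2), which gives a quick consistency check on the copied coefficients. The last two calls reproduce, up to rounding, the cross-validation "Total" error rates of 0.1623 and 0.1169.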