MODEL SELECTION for ``apple taste'' with 5 predictors |
PROC REG of taste on 5 variables |
MODEL RSQUARE=0.8793 (P=0.0016), but NOTHING IS SIGNIFICANT |
in the Parameter Estimate table !! |
NOTE ALSO that the parameter estimates for Na and K |
are large but of opposite sign, even though they are |
found together and have similar chemical effects. |
This may be an example of unreliable estimates of |
parameters when predictors are highly correlated. |
Number of Observations Read | 14 |
---|---|
Number of Observations Used | 14 |
Analysis of Variance | |||||
---|---|---|---|---|---|
Source | DF | Sum of Squares | Mean Square | F Value | Pr > F |
Model | 5 | 3577641 | 715528 | 11.65 | 0.0016 |
Error | 8 | 491317 | 61415 | ||
Corrected Total | 13 | 4068957 |
Root MSE | 247.81967 | R-Square | 0.8793 |
---|---|---|---|
Dependent Mean | 2195.42857 | Adj R-Sq | 0.8038 |
Coeff Var | 11.28799 |
Parameter Estimates | ||||||
---|---|---|---|---|---|---|
Variable | Label | DF | Parameter Estimate | Standard Error | t Value | Pr > |t| |
Intercept | Intercept | 1 | 299.64546 | 988.08291 | 0.30 | 0.7694 |
Nat | Sodium | 1 | -10.77226 | 76.56324 | -0.14 | 0.8916 |
Kk | Potassium | 1 | 43.61828 | 60.00005 | 0.73 | 0.4880 |
Pp | Phosphorus | 1 | 0.34494 | 0.74392 | 0.46 | 0.6552 |
Shade | Shade | 1 | -179.09980 | 1800.82748 | -0.10 | 0.9232 |
Water | Water | 1 | 1.75238 | 3.52645 | 0.50 | 0.6326 |
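The summary statistics of the full model can be re-derived from the Analysis of Variance entries alone. The sketch below is only an arithmetic check of the printed values (variable names are illustrative, and the inputs are the rounded numbers shown in the table), not a re-run of PROC REG.

```python
# Values copied from the Analysis of Variance table of the full model.
n = 14                  # observations used
p = 5                   # predictors, not counting the intercept
ss_error = 491317.0     # Error sum of squares (8 df)
ss_total = 4068957.0    # Corrected Total sum of squares (13 df)
ms_model = 715528.0     # Model mean square
ms_error = 61415.0      # Error mean square

# R-Square: share of the total variation explained by the model.
r_square = 1.0 - ss_error / ss_total

# Adjusted R-Square: same idea, but using mean squares (penalizes extra predictors).
adj_r_square = 1.0 - (ss_error / (n - p - 1)) / (ss_total / (n - 1))

# Overall F test: model mean square over error mean square.
f_value = ms_model / ms_error

print(round(r_square, 4), round(adj_r_square, 4), round(f_value, 2))
```

The overall F test uses all five predictors jointly, which is why it can be clearly significant even when no single t test in the Parameter Estimates table is.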
GENERATING A CORRELATION MATRIX using proc corr |
IN EACH CORRELATION TABLE ENTRY |
The first entry is the estimated Pearson rho |
The second entry is the P-value (Prob > |r|) for H_0:rho=0 |
THE FIRST ROW AND COLUMN are for Response vs. Predictors |
The other entries (among Predictors) show a complex pattern |
of high correlations between predictor variables, |
such as Na vs K (they are bundled together in fertilizers), |
P vs Water, and everything vs Shade. |
6 Variables: | yy Nat Kk Pp Shade Water |
---|
Pearson Correlation Coefficients, N = 14; Prob > |r| under H0: Rho=0 | ||||||
---|---|---|---|---|---|---|
Variable | yy | Nat | Kk | Pp | Shade | Water |
yy AppleTaste | 1.00000 | 0.41070 0.1446 | 0.41574 0.1393 | 0.75356 0.0019 | 0.91704 <.0001 | 0.73599 0.0027 |
Nat Sodium | 0.41070 0.1446 | 1.00000 | 0.95312 <.0001 | -0.13605 0.6428 | 0.59800 0.0239 | -0.14548 0.6197 |
Kk Potassium | 0.41574 0.1393 | 0.95312 <.0001 | 1.00000 | -0.17195 0.5567 | 0.57962 0.0298 | -0.19159 0.5117 |
Pp Phosphorus | 0.75356 0.0019 | -0.13605 0.6428 | -0.17195 0.5567 | 1.00000 | 0.69683 0.0056 | 0.97877 <.0001 |
Shade Shade | 0.91704 <.0001 | 0.59800 0.0239 | 0.57962 0.0298 | 0.69683 0.0056 | 1.00000 | 0.67403 0.0082 |
Water Water | 0.73599 0.0027 | -0.14548 0.6197 | -0.19159 0.5117 | 0.97877 <.0001 | 0.67403 0.0082 | 1.00000 |
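The Prob > |r| entries come from a t test on each correlation with N - 2 = 12 degrees of freedom. As a rough check, the sketch below converts a few of the tabulated correlations to t statistics and compares them with 2.179, the two-sided 5% critical value for 12 df taken from standard t tables. The function name and the choice of pairs are illustrative only.

```python
import math

def t_from_r(r, n):
    """t statistic for H0: rho = 0, based on n - 2 degrees of freedom."""
    return r * math.sqrt(n - 2) / math.sqrt(1.0 - r * r)

n = 14
t_crit = 2.179  # two-sided 5% critical value of t with 12 df (standard tables)

pairs = [
    ("yy vs Nat", 0.41070),     # reported Prob > |r| = 0.1446
    ("Nat vs Shade", 0.59800),  # reported Prob > |r| = 0.0239
    ("Pp vs Water", 0.97877),   # reported Prob > |r| < .0001
]

for label, r in pairs:
    t = t_from_r(r, n)
    verdict = "significant at 5%" if abs(t) > t_crit else "not significant at 5%"
    print(label, round(t, 2), verdict)
```

The verdicts agree with the reported P-values: 0.1446 is above 0.05, while 0.0239 and <.0001 are below it.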
A SECOND RUN OF PROC REG for VIF scores |
ALL VIF scores for predictors are too large |
(VIF > 10 is a common rule-of-thumb cutoff.) |
This suggests that the predictors are highly correlated |
and that some if not most should be dropped. |
Fortunately, the output also suggests that there are |
no outliers. |
Number of Observations Read | 14 |
---|---|
Number of Observations Used | 14 |
Analysis of Variance | |||||
---|---|---|---|---|---|
Source | DF | Sum of Squares | Mean Square | F Value | Pr > F |
Model | 5 | 3577641 | 715528 | 11.65 | 0.0016 |
Error | 8 | 491317 | 61415 | ||
Corrected Total | 13 | 4068957 |
Root MSE | 247.81967 | R-Square | 0.8793 |
---|---|---|---|
Dependent Mean | 2195.42857 | Adj R-Sq | 0.8038 |
Coeff Var | 11.28799 |
Parameter Estimates | |||||||
---|---|---|---|---|---|---|---|
Variable | Label | DF | Parameter Estimate | Standard Error | t Value | Pr > |t| | Variance Inflation |
Intercept | Intercept | 1 | 299.64546 | 988.08291 | 0.30 | 0.7694 | 0 |
Nat | Sodium | 1 | -10.77226 | 76.56324 | -0.14 | 0.8916 | 26.21534 |
Kk | Potassium | 1 | 43.61828 | 60.00005 | 0.73 | 0.4880 | 80.37797 |
Pp | Phosphorus | 1 | 0.34494 | 0.74392 | 0.46 | 0.6552 | 144.71123 |
Shade | Shade | 1 | -179.09980 | 1800.82748 | -0.10 | 0.9232 | 274.06551 |
Water | Water | 1 | 1.75238 | 3.52645 | 0.50 | 0.6326 | 31.40784 |
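Each Variance Inflation value equals 1/(1 - R_j^2), where R_j^2 is the R-square from regressing predictor j on the other predictors. The sketch below inverts that relation for the five VIF values in the table, and also shows sqrt(VIF), the factor by which the standard error grows relative to a predictor that is uncorrelated with the rest. The dictionary names are illustrative.

```python
import math

# Variance Inflation column copied from the Parameter Estimates table.
vif = {
    "Nat": 26.21534,
    "Kk": 80.37797,
    "Pp": 144.71123,
    "Shade": 274.06551,
    "Water": 31.40784,
}

# R-square of each predictor regressed on the other four: VIF = 1 / (1 - R2).
r2_on_others = {name: 1.0 - 1.0 / v for name, v in vif.items()}

# Standard-error inflation factor for each coefficient.
se_inflation = {name: math.sqrt(v) for name, v in vif.items()}

for name in vif:
    print(name, round(r2_on_others[name], 4), round(se_inflation[name], 1))
```

Every predictor is more than 96% explained by the other four, which is consistent with the very wide standard errors in the Parameter Estimates table.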
Output Statistics | ||||||||
---|---|---|---|---|---|---|---|---|
Obs | Dependent Variable | Predicted Value | Std Error Mean Predict | Residual | Std Error Residual | Student Residual | -2-1 0 1 2 | Cook's D |
1 | 2876 | 2545 | 162.2851 | 330.9954 | 187.3 | 1.767 | | |*** | | 0.391 |
2 | 2078 | 2054 | 134.5014 | 24.3789 | 208.1 | 0.117 | | | | | 0.001 |
3 | 3052 | 2921 | 185.3258 | 131.4174 | 164.5 | 0.799 | | |* | | 0.135 |
4 | 2265 | 1962 | 121.7668 | 303.4113 | 215.8 | 1.406 | | |** | | 0.105 |
5 | 940.0000 | 1121 | 209.0136 | -181.2443 | 133.1 | -1.361 | | **| | | 0.761 |
6 | 2815 | 2768 | 163.4178 | 47.4069 | 186.3 | 0.254 | | | | | 0.008 |
7 | 2661 | 2735 | 148.6937 | -73.6539 | 198.3 | -0.372 | | | | | 0.013 |
8 | 2181 | 2279 | 143.1761 | -97.7067 | 202.3 | -0.483 | | | | | 0.019 |
9 | 2052 | 1952 | 151.6323 | 99.5374 | 196.0 | 0.508 | | |* | | 0.026 |
10 | 2064 | 2314 | 136.7896 | -250.1510 | 206.6 | -1.211 | | **| | | 0.107 |
11 | 1551 | 1348 | 137.5019 | 202.6941 | 206.2 | 0.983 | | |* | | 0.072 |
12 | 2338 | 2587 | 135.9839 | -248.8729 | 207.2 | -1.201 | | **| | | 0.104 |
13 | 1753 | 1848 | 219.8169 | -95.4855 | 114.4 | -0.834 | | *| | | 0.428 |
14 | 2110 | 2303 | 185.6468 | -192.7272 | 164.2 | -1.174 | | **| | | 0.294 |
Sum of Residuals | 0 |
---|---|
Sum of Squared Residuals | 491317 |
Predicted Residual SS (PRESS) | 1782663 |
MODEL SELECTION of RESPONSE (YY) for 5 Predictors |
Comparing Stepwise, Backward, and Mallows' C(p) |
Stepwise Regression finds Na (Sodium) and Shade |
Backward Elimination finds K (Potassium) and P (Phosphorus) |
Mallows' C(p) gives a sorted list of models; K P is slightly best |
Number of Observations Read | 14 |
---|---|
Number of Observations Used | 14 |
Analysis of Variance | |||||
---|---|---|---|---|---|
Source | DF | Sum of Squares | Mean Square | F Value | Pr > F |
Model | 1 | 3421848 | 3421848 | 63.45 | <.0001 |
Error | 12 | 647110 | 53926 | ||
Corrected Total | 13 | 4068957 |
Variable | Parameter Estimate | Standard Error | Type II SS | F Value | Pr > F |
---|---|---|---|---|---|
Intercept | 645.72764 | 204.20305 | 539226 | 10.00 | 0.0082 |
Shade | 811.96905 | 101.93130 | 3421848 | 63.45 | <.0001 |
Analysis of Variance | |||||
---|---|---|---|---|---|
Source | DF | Sum of Squares | Mean Square | F Value | Pr > F |
Model | 2 | 3541934 | 1770967 | 36.96 | <.0001 |
Error | 11 | 527023 | 47911 | ||
Corrected Total | 13 | 4068957 |
Variable | Parameter Estimate | Standard Error | Type II SS | F Value | Pr > F |
---|---|---|---|---|---|
Intercept | 798.46545 | 215.30332 | 658943 | 13.75 | 0.0034 |
Nat | -26.08884 | 16.47880 | 120087 | 2.51 | 0.1417 |
Shade | 925.46000 | 119.87479 | 2855594 | 59.60 | <.0001 |
Summary of Stepwise Selection | |||||||||
---|---|---|---|---|---|---|---|---|---|
Step | Variable Entered | Variable Removed | Label | Number Vars In | Partial R-Square | Model R-Square | C(p) | F Value | Pr > F |
1 | Shade | | Shade | 1 | 0.8410 | 0.8410 | 0.5367 | 63.45 | <.0001 |
2 | Nat | | Sodium | 2 | 0.0295 | 0.8705 | 0.5814 | 2.51 | 0.1417 |
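The step-2 entries in the stepwise summary follow directly from the Type II SS for Nat in the two-variable model. The sketch below is a check on that arithmetic with illustrative variable names. It assumes the run used the SAS default entry level of 0.15 for stepwise selection, which would explain why Nat entered with Pr > F = 0.1417; the listing itself does not show the option.

```python
# Values copied from the two-variable (Nat, Shade) stepwise output.
ss_total = 4068957.0      # Corrected Total sum of squares
type2_ss_nat = 120087.0   # Type II SS for Nat, given Shade is in the model
ms_error = 47911.0        # Error mean square of the two-variable model (11 df)

# Partial R-Square: extra share of total variation gained by adding Nat.
partial_r_square = type2_ss_nat / ss_total

# F statistic for adding Nat (1 numerator df, 11 denominator df).
f_to_enter = type2_ss_nat / ms_error

# Assumed entry threshold (SAS default for SELECTION=STEPWISE); not shown in the listing.
assumed_slentry = 0.15
reported_p = 0.1417

print(round(partial_r_square, 4), round(f_to_enter, 2), reported_p < assumed_slentry)
```

Under a stricter entry level such as 0.05, Nat would not have entered and stepwise selection would have stopped at Shade alone.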
BACKWARD ELIMINATION finds K (Potassium) and P (Phosphorus) |
Number of Observations Read | 14 |
---|---|
Number of Observations Used | 14 |
Analysis of Variance | |||||
---|---|---|---|---|---|
Source | DF | Sum of Squares | Mean Square | F Value | Pr > F |
Model | 5 | 3577641 | 715528 | 11.65 | 0.0016 |
Error | 8 | 491317 | 61415 | ||
Corrected Total | 13 | 4068957 |
Variable | Parameter Estimate | Standard Error | Type II SS | F Value | Pr > F |
---|---|---|---|---|---|
Intercept | 299.64546 | 988.08291 | 5648.07120 | 0.09 | 0.7694 |
Nat | -10.77226 | 76.56324 | 1215.75074 | 0.02 | 0.8916 |
Kk | 43.61828 | 60.00005 | 32457 | 0.53 | 0.4880 |
Pp | 0.34494 | 0.74392 | 13204 | 0.22 | 0.6552 |
Shade | -179.09980 | 1800.82748 | 607.45977 | 0.01 | 0.9232 |
Water | 1.75238 | 3.52645 | 15165 | 0.25 | 0.6326 |
Analysis of Variance | |||||
---|---|---|---|---|---|
Source | DF | Sum of Squares | Mean Square | F Value | Pr > F |
Model | 4 | 3577033 | 894258 | 16.36 | 0.0004 |
Error | 9 | 491924 | 54658 | ||
Corrected Total | 13 | 4068957 |
Variable | Parameter Estimate | Standard Error | Type II SS | F Value | Pr > F |
---|---|---|---|---|---|
Intercept | 391.72312 | 325.63626 | 79095 | 1.45 | 0.2597 |
Nat | -16.51251 | 47.45755 | 6617.15339 | 0.12 | 0.7359 |
Kk | 38.09669 | 21.46412 | 172188 | 3.15 | 0.1097 |
Pp | 0.27749 | 0.28833 | 50626 | 0.93 | 0.3610 |
Water | 1.59125 | 2.95493 | 15850 | 0.29 | 0.6033 |
Analysis of Variance | |||||
---|---|---|---|---|---|
Source | DF | Sum of Squares | Mean Square | F Value | Pr > F |
Model | 3 | 3570416 | 1190139 | 23.87 | <.0001 |
Error | 10 | 498541 | 49854 | ||
Corrected Total | 13 | 4068957 |
Variable | Parameter Estimate | Standard Error | Type II SS | F Value | Pr > F |
---|---|---|---|---|---|
Intercept | 317.02761 | 233.84411 | 91631 | 1.84 | 0.2050 |
Kk | 30.97381 | 6.16202 | 1259634 | 25.27 | 0.0005 |
Pp | 0.29157 | 0.27264 | 57019 | 1.14 | 0.3100 |
Water | 1.42377 | 2.78440 | 13035 | 0.26 | 0.6202 |
Analysis of Variance | |||||
---|---|---|---|---|---|
Source | DF | Sum of Squares | Mean Square | F Value | Pr > F |
Model | 2 | 3557381 | 1778690 | 38.25 | <.0001 |
Error | 11 | 511577 | 46507 | ||
Corrected Total | 13 | 4068957 |
Variable | Parameter Estimate | Standard Error | Type II SS | F Value | Pr > F |
---|---|---|---|---|---|
Intercept | 337.00028 | 222.68472 | 106512 | 2.29 | 0.1584 |
Kk | 30.61040 | 5.91185 | 1246835 | 26.81 | 0.0003 |
Pp | 0.42795 | 0.05463 | 2854105 | 61.37 | <.0001 |
Summary of Backward Elimination | ||||||||
---|---|---|---|---|---|---|---|---|
Step | Variable Removed | Label | Number Vars In | Partial R-Square | Model R-Square | C(p) | F Value | Pr > F |
1 | Shade | Shade | 4 | 0.0001 | 0.8791 | 4.0099 | 0.01 | 0.9232 |
2 | Nat | Sodium | 3 | 0.0016 | 0.8775 | 2.1176 | 0.12 | 0.7359 |
3 | Water | Water | 2 | 0.0032 | 0.8743 | 0.3299 | 0.26 | 0.6202 |
MALLOWS' C(p) gives a sorted list of models; K P is slightly best |
Number of Observations Read | 14 |
---|---|
Number of Observations Used | 14 |
Number in Model | C(p) | R-Square | Variables in Model |
---|---|---|---|
2 | 0.3299 | 0.8743 | Kk Pp |
1 | 0.5367 | 0.8410 | Shade |
2 | 0.5814 | 0.8705 | Nat Shade |
2 | 0.8473 | 0.8665 | Pp Shade |
2 | 0.8496 | 0.8664 | Shade Water |
2 | 1.0461 | 0.8635 | Kk Water |
2 | 1.1990 | 0.8612 | Kk Shade |
3 | 2.1176 | 0.8775 | Kk Pp Water |
3 | 2.2680 | 0.8752 | Nat Kk Pp |
3 | 2.3222 | 0.8744 | Kk Pp Shade |
3 | 2.4351 | 0.8727 | Nat Kk Shade |
3 | 2.5548 | 0.8709 | Nat Pp Shade |
3 | 2.5813 | 0.8705 | Nat Shade Water |
3 | 2.6948 | 0.8688 | Kk Shade Water |
3 | 2.8147 | 0.8670 | Pp Shade Water |
3 | 2.8342 | 0.8667 | Nat Kk Water |
2 | 2.8513 | 0.8362 | Nat Pp |
4 | 4.0099 | 0.8791 | Nat Kk Pp Water |
4 | 4.0198 | 0.8790 | Kk Pp Shade Water |
4 | 4.2150 | 0.8760 | Nat Kk Shade Water |
2 | 4.2187 | 0.8156 | Nat Water |
4 | 4.2469 | 0.8755 | Nat Kk Pp Shade |
4 | 4.5285 | 0.8713 | Nat Pp Shade Water |
3 | 4.8136 | 0.8368 | Nat Pp Water |
5 | 6.0000 | 0.8793 | Nat Kk Pp Shade Water |
1 | 18.6318 | 0.5678 | Pp |
1 | 20.3651 | 0.5417 | Water |
2 | 20.6280 | 0.5679 | Pp Water |
1 | 44.8026 | 0.1728 | Kk |
1 | 45.0784 | 0.1687 | Nat |
2 | 46.6515 | 0.1751 | Nat Kk |
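The C(p) column can be reproduced from quantities printed elsewhere in the output using C(p) = SSE_p / MSE_full - (n - 2p), where p counts the parameters including the intercept. The sketch below applies this to the models whose error sums of squares appear in the ANOVA tables; the function and dictionary names are illustrative.

```python
n = 14
mse_full = 491317.0 / 8   # error mean square of the full five-predictor model

def mallows_cp(sse, n_predictors):
    """Mallows' C(p) for a submodel with the given error sum of squares."""
    p = n_predictors + 1              # parameters, including the intercept
    return sse / mse_full - (n - 2 * p)

# (error sum of squares, number of predictors), copied from the ANOVA tables.
models = {
    "Kk Pp": (511577.0, 2),
    "Shade": (647110.0, 1),
    "Nat Shade": (527023.0, 2),
    "Nat Kk Pp Shade Water": (491317.0, 5),
}

cp = {name: mallows_cp(sse, k) for name, (sse, k) in models.items()}
for name, value in cp.items():
    print(name, round(value, 4))
```

A model with little bias has C(p) close to or below its parameter count p. The full model always gets C(p) = p exactly (6 here), so it serves as a check on the arithmetic and not as evidence for the model.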
CHECKING THE CONSENSUS BEST MODEL: |
VIF scores are much smaller |
Parameter estimates are all positive |
Both slope estimates are individually significant (the intercept is not) |
The output suggests that there are still no outliers. |
Number of Observations Read | 14 |
---|---|
Number of Observations Used | 14 |
Analysis of Variance | |||||
---|---|---|---|---|---|
Source | DF | Sum of Squares | Mean Square | F Value | Pr > F |
Model | 2 | 3557381 | 1778690 | 38.25 | <.0001 |
Error | 11 | 511577 | 46507 | ||
Corrected Total | 13 | 4068957 |
Root MSE | 215.65474 | R-Square | 0.8743 |
---|---|---|---|
Dependent Mean | 2195.42857 | Adj R-Sq | 0.8514 |
Coeff Var | 9.82290 |
Parameter Estimates | |||||||
---|---|---|---|---|---|---|---|
Variable | Label | DF | Parameter Estimate | Standard Error | t Value | Pr > |t| | Variance Inflation |
Intercept | Intercept | 1 | 337.00028 | 222.68472 | 1.51 | 0.1584 | 0 |
Kk | Potassium | 1 | 30.61040 | 5.91185 | 5.18 | 0.0003 | 1.03047 |
Pp | Phosphorus | 1 | 0.42795 | 0.05463 | 7.83 | <.0001 | 1.03047 |
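With only two predictors, both share the same VIF, 1/(1 - r^2), where r is their pairwise correlation. The sketch below checks this against the Kk vs Pp correlation of -0.17195 from the PROC CORR output, and recomputes the two t values as estimate divided by standard error. Variable names are illustrative.

```python
# Pairwise correlation between Kk and Pp, from the correlation matrix.
r_kk_pp = -0.17195

# In a two-predictor model both predictors have the same VIF.
vif_two_predictors = 1.0 / (1.0 - r_kk_pp ** 2)

# t Value = Parameter Estimate / Standard Error (values from the table).
t_kk = 30.61040 / 5.91185
t_pp = 0.42795 / 0.05463

print(round(vif_two_predictors, 5), round(t_kk, 2), round(t_pp, 2))
```

The result matches the Variance Inflation column (1.03047 for both Kk and Pp), so the standard errors in this model are essentially uninflated by collinearity.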
Output Statistics | ||||||||
---|---|---|---|---|---|---|---|---|
Obs | Dependent Variable | Predicted Value | Std Error Mean Predict | Residual | Std Error Residual | Student Residual | -2-1 0 1 2 | Cook's D |
1 | 2876 | 2565 | 112.7706 | 311.0664 | 183.8 | 1.692 | | |*** | | 0.359 |
2 | 2078 | 2018 | 75.4518 | 60.0722 | 202.0 | 0.297 | | | | | 0.004 |
3 | 3052 | 2927 | 103.7734 | 124.8913 | 189.0 | 0.661 | | |* | | 0.044 |
4 | 2265 | 1929 | 65.2153 | 336.4415 | 205.6 | 1.637 | | |*** | | 0.090 |
5 | 940.0000 | 1171 | 150.6759 | -231.3620 | 154.3 | -1.500 | | **| | | 0.715 |
6 | 2815 | 2811 | 91.1217 | 3.7117 | 195.5 | 0.0190 | | | | | 0.000 |
7 | 2661 | 2685 | 101.8988 | -24.3510 | 190.1 | -0.128 | | | | | 0.002 |
8 | 2181 | 2341 | 60.1426 | -160.3518 | 207.1 | -0.774 | | *| | | 0.017 |
9 | 2052 | 1963 | 75.6404 | 89.2776 | 202.0 | 0.442 | | | | | 0.009 |
10 | 2064 | 2285 | 82.1294 | -221.1845 | 199.4 | -1.109 | | **| | | 0.070 |
11 | 1551 | 1345 | 119.4113 | 205.6548 | 179.6 | 1.145 | | |** | | 0.193 |
12 | 2338 | 2552 | 102.9772 | -214.0711 | 189.5 | -1.130 | | **| | | 0.126 |
13 | 1753 | 1797 | 73.4737 | -43.9520 | 202.8 | -0.217 | | | | | 0.002 |
14 | 2110 | 2346 | 135.4746 | -235.8430 | 167.8 | -1.406 | | **| | | 0.429 |
Sum of Residuals | 0 |
---|---|
Sum of Squared Residuals | 511577 |
Predicted Residual SS (PRESS) | 983500 |
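One way to compare the two fits on predictive grounds is to convert each PRESS value into a "predicted R-square", 1 - PRESS / SS_total. This quantity is not printed by PROC REG in the listing; it is derived here only from the PRESS and Corrected Total values reported for the five-predictor model and for the Kk Pp model, and the names are illustrative.

```python
ss_total = 4068957.0   # Corrected Total sum of squares (same for both models)

# PRESS values copied from the two Output Statistics summaries.
press = {
    "Nat Kk Pp Shade Water": 1782663.0,
    "Kk Pp": 983500.0,
}

# Predicted R-square: like R-square, but each residual is computed with
# that observation left out of the fit.
predicted_r_square = {name: 1.0 - value / ss_total for name, value in press.items()}

for name, value in predicted_r_square.items():
    print(name, round(value, 3))
```

The ordinary R-Square values are nearly identical (0.8793 vs 0.8743), but the leave-one-out version drops much further for the five-predictor model, which is what one would expect when collinear predictors make the coefficients unstable.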