REGRESSION for taste on 5 covariates - YOURNAME
PROC REG of taste on 5 variables
20:00 Monday, October 3, 2005

EVEN THOUGH THE MODEL HAS P=0.0016 WITH Rsquare=0.88, NONE of the
INDIVIDUAL PARAMETER ESTIMATES are ANYWHERE NEAR significant!
At least, as a consolation, there are no apparent problems with the
Studentized residuals.

The REG Procedure
Model: MODEL1
Dependent Variable: yy Apple Taste

Number of Observations Read    14
Number of Observations Used    14

                       Analysis of Variance

                              Sum of         Mean
Source             DF        Squares       Square   F Value   Pr > F
Model               5        3577641       715528     11.65   0.0016
Error               8         491317        61415
Corrected Total    13        4068957

Root MSE           247.81967    R-Square    0.8793
Dependent Mean    2195.42857    Adj R-Sq    0.8038
Coeff Var           11.28799

                        Parameter Estimates

                                Parameter     Standard
Variable    Label         DF     Estimate        Error   t Value   Pr > |t|
Intercept   Intercept      1    299.64546    988.08291      0.30     0.7694
nat         Sodium         1    -10.77226     76.56324     -0.14     0.8916
kk          Potassium      1     43.61828     60.00005      0.73     0.4880
pp          Phosphorus     1      0.34494      0.74392      0.46     0.6552
shade       Shade          1   -179.09980   1800.82748     -0.10     0.9232
water       Water          1      1.75238      3.52645      0.50     0.6326
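The summary statistics printed next to the Analysis of Variance table all follow from the sums of squares and degrees of freedom in that table. The sketch below recomputes them as a cross-check; the function name `anova_summary` and the constant names are illustrative assumptions, and the inputs are the rounded values shown in the listing, so the last digits can differ slightly from what SAS prints.

```python
import math

# Sums of squares and degrees of freedom copied from the
# PROC REG Analysis of Variance table for the five-covariate model.
SS_MODEL, DF_MODEL = 3577641.0, 5
SS_ERROR, DF_ERROR = 491317.0, 8
SS_TOTAL, DF_TOTAL = 4068957.0, 13
DEPENDENT_MEAN = 2195.42857


def anova_summary(ss_model, df_model, ss_error, df_error,
                  ss_total, df_total, mean_y):
    """Recompute the statistics PROC REG reports alongside the ANOVA table."""
    ms_model = ss_model / df_model          # mean square for the model
    ms_error = ss_error / df_error          # mean square error
    root_mse = math.sqrt(ms_error)
    return {
        "f_value": ms_model / ms_error,
        "r_square": 1.0 - ss_error / ss_total,
        # Adjusted R-square compares mean squares instead of sums of squares.
        "adj_r_square": 1.0 - ms_error / (ss_total / df_total),
        "root_mse": root_mse,
        "coeff_var": 100.0 * root_mse / mean_y,
    }


summary = anova_summary(SS_MODEL, DF_MODEL, SS_ERROR, DF_ERROR,
                        SS_TOTAL, DF_TOTAL, DEPENDENT_MEAN)
for name, value in summary.items():
    print(f"{name:>12s} = {value:.4f}")
```

The gap between R-Square (0.88) and Adj R-Sq (0.80) reflects the five model degrees of freedom spent on only 14 observations.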
The REG Procedure
Model: MODEL1
Dependent Variable: yy Apple Taste

                              Output Statistics

       Dependent   Predicted      Std Error                Std Error    Student
 Obs    Variable       Value   Mean Predict     Residual    Residual   Residual
   1        2876        2545       162.2851     330.9954       187.3      1.767
   2        2078        2054       134.5014      24.3789       208.1      0.117
   3        3052        2921       185.3258     131.4174       164.5      0.799
   4        2265        1962       121.7668     303.4113       215.8      1.406
   5    940.0000        1121       209.0136    -181.2443       133.1     -1.361
   6        2815        2768       163.4178      47.4069       186.3      0.254
   7        2661        2735       148.6937     -73.6539       198.3     -0.372
   8        2181        2279       143.1761     -97.7067       202.3     -0.483
   9        2052        1952       151.6323      99.5374       196.0      0.508
  10        2064        2314       136.7896    -250.1510       206.6     -1.211
  11        1551        1348       137.5019     202.6941       206.2      0.983
  12        2338        2587       135.9839    -248.8729       207.2     -1.201
  13        1753        1848       219.8169     -95.4855       114.4     -0.834
  14        2110        2303       185.6468    -192.7272       164.2     -1.174

          Output Statistics

                           Cook's
 Obs      -2-1 0 1 2            D
   1   |      |***   |      0.391
   2   |      |      |      0.001
   3   |      |*     |      0.135
   4   |      |**    |      0.105
   5   |    **|      |      0.761
   6   |      |      |      0.008
   7   |      |      |      0.013
   8   |      |      |      0.019
   9   |      |*     |      0.026
  10   |    **|      |      0.107
  11   |      |*     |      0.072
  12   |    **|      |      0.104
  13   |     *|      |      0.428
  14   |    **|      |      0.294

Sum of Residuals                        0
Sum of Squared Residuals           491317
Predicted Residual SS (PRESS)     1782663


CORRELATIONS of taste WITH explanatory variables
and BETWEEN PAIRS OF explanatory variables

The CORR Procedure

6 Variables:    yy    nat    kk    pp    shade    water

                                   Simple Statistics

Variable    N        Mean      Std Dev         Sum     Minimum     Maximum   Label
yy         14        2195    559.46110       30736   940.00000        3052   Apple Taste
nat        14    14.15714      4.59644   198.20000     7.50000    20.40000   Sodium
kk         14    21.35714     10.27025   299.00000     4.00000    38.00000   Potassium
pp         14        2815         1111       39410   233.00000        4408   Phosphorus
shade      14     1.90857      0.63186    26.72000     0.84000     2.79000   Shade
water      14   278.21429    109.23100        3895    18.00000   453.00000   Water

                   Pearson Correlation Coefficients, N = 14
                          Prob > |r| under H0: Rho=0

                    yy        nat         kk         pp      shade      water
yy             1.00000    0.41070    0.41574    0.75356    0.91704    0.73599
Apple Taste                0.1446     0.1393     0.0019     <.0001     0.0027

nat            0.41070    1.00000    0.95312   -0.13605    0.59800   -0.14548
Sodium          0.1446                <.0001     0.6428     0.0239     0.6197

kk             0.41574    0.95312    1.00000   -0.17195    0.57962   -0.19159
Potassium       0.1393     <.0001                0.5567     0.0298     0.5117

pp             0.75356   -0.13605   -0.17195    1.00000    0.69683    0.97877
Phosphorus      0.0019     0.6428     0.5567                0.0056     <.0001

shade          0.91704    0.59800    0.57962    0.69683    1.00000    0.67403
Shade           <.0001     0.0239     0.0298     0.0056                0.0082

water          0.73599   -0.14548   -0.19159    0.97877    0.67403    1.00000
Water           0.0027     0.6197     0.5117     <.0001     0.0082


APPLY PROC GLM to the same data

SOME effects are significant in the Type I table, but NOTHING IS
SIGNIFICANT in the Type III table. Predicted and residual values are
saved for later analysis.

The GLM Procedure

Number of Observations Read    14
Number of Observations Used    14
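The pattern described in the title (a significant model whose individual terms are not significant once adjusted for each other) is what strongly correlated covariates tend to produce. As a rough sketch, the pairwise correlations from the PROC CORR table can be turned into two-variable variance inflation factors, 1/(1 - r^2). This is not the full-model VIF that PROC REG would report with its VIF option; each pairwise value is only a lower bound on it. The cutoff of 10 and all names in the code are assumptions made for illustration.

```python
# Pairwise Pearson correlations between covariates, copied from the
# PROC CORR table (the yy row and column are left out on purpose).
PAIR_CORR = {
    ("nat", "kk"): 0.95312,
    ("nat", "pp"): -0.13605,
    ("nat", "shade"): 0.59800,
    ("nat", "water"): -0.14548,
    ("kk", "pp"): -0.17195,
    ("kk", "shade"): 0.57962,
    ("kk", "water"): -0.19159,
    ("pp", "shade"): 0.69683,
    ("pp", "water"): 0.97877,
    ("shade", "water"): 0.67403,
}

VIF_CUTOFF = 10.0  # assumed rule-of-thumb threshold, not part of the SAS output


def pairwise_vif(r):
    """Variance inflation implied by one pairwise correlation r."""
    return 1.0 / (1.0 - r * r)


vif = {pair: pairwise_vif(r) for pair, r in PAIR_CORR.items()}
flagged = sorted(pair for pair, value in vif.items() if value > VIF_CUTOFF)

# Print the pairs from most to least inflated.
for pair, value in sorted(vif.items(), key=lambda item: -item[1]):
    marker = "  <-- above cutoff" if value > VIF_CUTOFF else ""
    print(f"{pair[0]:>5s} - {pair[1]:<5s}  VIF >= {value:6.2f}{marker}")
```

Under these assumptions only two pairs stand out, sodium with potassium and phosphorus with water, which suggests that each pair carries nearly the same information about taste.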
The GLM Procedure

Dependent Variable: yy Apple Taste

                                Sum of
Source              DF         Squares     Mean Square   F Value   Pr > F
Model                5     3577640.728      715528.146     11.65   0.0016
Error                8      491316.700       61414.588
Corrected Total     13     4068957.429

R-Square    Coeff Var    Root MSE     yy Mean
0.879252     11.28799    247.8197    2195.429

Source    DF       Type I SS     Mean Square   F Value   Pr > F
nat        1      686340.032      686340.032     11.18   0.0102
kk         1       26220.418       26220.418      0.43   0.5318
pp         1     2848622.576     2848622.576     46.38   0.0001
shade      1        1292.354        1292.354      0.02   0.8882
water      1       15165.348       15165.348      0.25   0.6326

Source    DF     Type III SS     Mean Square   F Value   Pr > F
nat        1      1215.75074      1215.75074      0.02   0.8916
kk         1     32456.76875     32456.76875      0.53   0.4880
pp         1     13204.17776     13204.17776      0.22   0.6552
shade      1       607.45977       607.45977      0.01   0.9232
water      1     15165.34810     15165.34810      0.25   0.6326

                                  Standard
Parameter         Estimate           Error   t Value   Pr > |t|
Intercept      299.6454569      988.082906      0.30     0.7694
nat            -10.7722597       76.563241     -0.14     0.8916
kk              43.6182762       60.000052      0.73     0.4880
pp               0.3449429        0.743922      0.46     0.6552
shade         -179.0997997     1800.827477     -0.10     0.9232
water            1.7523772        3.526446      0.50     0.6326


Full-model residuals by Predicted Values and Covariates

[Plot of resid*pred. Legend: A = 1 obs, B = 2 obs, etc. Vertical axis:
resid, -400 to 400. Horizontal axis: pred, 1000 to 3000. 14 points,
each plotted as A.]


NORMAL DIAGNOSTICS for residuals

The UNIVARIATE Procedure
Variable: resid

                            Moments

N                          14    Sum Weights                14
Mean                        0    Sum Observations            0
Std Deviation      194.405741    Variance           37793.5923
Skewness           0.36052368    Kurtosis           -1.0151076
Uncorrected SS       491316.7    Corrected SS         491316.7
Coeff Variation             .    Std Error Mean     51.9571199

               Basic Statistical Measures

    Location                    Variability
Mean        0.0000     Std Deviation            194.40574
Median    -24.6375     Variance                     37794
Mode             .     Range                    581.14641
                       Interquartile Range      312.66177

           Tests for Location: Mu0=0

Test           -Statistic-    -----p Value------
Student's t    t         0    Pr > |t|    1.0000
Sign           M         0    Pr >= |M|   1.0000
Signed Rank    S       0.5    Pr >= |S|   1.0000

                  Tests for Normality

Test                  --Statistic---    -----p Value------
Shapiro-Wilk          W     0.938993    Pr < W      0.4056
Kolmogorov-Smirnov    D     0.147607    Pr > D     >0.1500
Cramer-von Mises      W-Sq   0.03905    Pr > W-Sq  >0.2500
Anderson-Darling      A-Sq  0.282565    Pr > A-Sq  >0.2500

Quantiles (Definition 5)

Quantile        Estimate
100% Max        330.9954
99%             330.9954
95%             330.9954
90%             303.4113
75% Q3          131.4174
50% Median      -24.6375
25% Q1         -181.2443
10%            -248.8729
5%             -250.1510
1%             -250.1510
0% Min         -250.1510

          Extreme Observations

------Lowest------    ------Highest-----
    Value      Obs        Value      Obs
-250.1510       10      99.5374        9
-248.8729       12     131.4174        3
-192.7272       14     202.6941       11
-181.2443        5     303.4113        4
 -97.7067        8     330.9954        1

Stem Leaf                     #  Boxplot
   3 03                       2     |
   2 0                        1     |
   1 03                       2  +-----+
   0 25                       2  |  +  |
  -0 7                        1  *-----*
  -1 9800                     4  +-----+
  -2 55                       2     |
     ----+----+----+----+
 Multiply Stem.Leaf by 10**+2

[Normal Probability Plot of resid. Vertical axis: resid, -250 to 350.
Horizontal axis: normal quantiles, -2 to +2.]


Regression of YY on Na P only

Both Na and P are highly significant in both the Type I and Type III
tables and also in the Parameter Estimate table. This means that the
numerical effects of Na and P levels on taste can be trusted. If two
correlated variables are not individually significant, then the
numerical effects of either cannot be resolved.
The GLM Procedure

Number of Observations Read    14
Number of Observations Used    14

Dependent Variable: yy Apple Taste

                                Sum of
Source              DF         Squares     Mean Square   F Value   Pr > F
Model                2     3402530.181     1701265.091     28.08   <.0001
Error               11      666427.247       60584.295
Corrected Total     13     4068957.429

R-Square    Coeff Var    Root MSE     yy Mean
0.836217     11.21142    246.1388    2195.429

Source    DF       Type I SS     Mean Square   F Value   Pr > F
nat        1      686340.032      686340.032     11.33   0.0063
pp         1     2716190.150     2716190.150     44.83   <.0001

Source    DF     Type III SS     Mean Square   F Value   Pr > F
nat        1     1091984.524     1091984.524     18.02   0.0014
pp         1     2716190.150     2716190.150     44.83   <.0001

                                  Standard
Parameter         Estimate           Error   t Value   Pr > |t|
Intercept      125.8072820     299.8488985      0.42     0.6829
nat             63.6461622      14.9914627      4.25     0.0014
pp               0.4151238       0.0619980      6.70     <.0001


Regression of YY on Na P Shade only

Even though the Model R^2 is higher with the 3rd covariate, Shade is
not significant in the Type I table and destroys the significance of
Na and P in the Type III table. Thus (Na P) is enough, and
(Na P Shade) is too much.

The GLM Procedure

Number of Observations Read    14
Number of Observations Used    14
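One way to check the claim that (Na P) is enough is a partial F test of the reduced model against the full five-covariate fit, using the two Error lines already printed above (666427.247 on 11 DF for Na P, 491316.700 on 8 DF for all five covariates). The sketch below is an illustration under that assumption; the helper name `partial_f` is hypothetical and no p-value is computed.

```python
def partial_f(sse_reduced, df_reduced, sse_full, df_full):
    """F ratio for the terms dropped when going from the full to the reduced model."""
    extra_ss = sse_reduced - sse_full      # variation explained by the dropped terms
    extra_df = df_reduced - df_full        # number of dropped terms
    return (extra_ss / extra_df) / (sse_full / df_full)


# Error sums of squares and error DF copied from the two PROC GLM tables.
SSE_NA_P, DF_NA_P = 666427.247, 11     # yy on nat pp
SSE_FULL, DF_FULL = 491316.700, 8      # yy on nat kk pp shade water

f_drop3 = partial_f(SSE_NA_P, DF_NA_P, SSE_FULL, DF_FULL)
print(f"Partial F for kk, shade, water: {f_drop3:.4f} "
      f"on {DF_NA_P - DF_FULL} and {DF_FULL} DF")
```

An F ratio below 1 means the three dropped covariates together explain less variation per degree of freedom than the residual mean square of the full model, which is consistent with keeping only Na and P.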
The GLM Procedure

Dependent Variable: yy Apple Taste

                                Sum of
Source              DF         Squares     Mean Square   F Value   Pr > F
Model                3     3543568.765     1181189.588     22.48   <.0001
Error               10      525388.663       52538.866
Corrected Total     13     4068957.429

R-Square    Coeff Var    Root MSE     yy Mean
0.870879     10.44049    229.2136    2195.429

Source    DF       Type I SS     Mean Square   F Value   Pr > F
nat        1      686340.032      686340.032     13.06   0.0047
pp         1     2716190.150     2716190.150     51.70   <.0001
shade      1      141038.584      141038.584      2.68   0.1324

Source    DF     Type III SS     Mean Square   F Value   Pr > F
nat        1      17963.6562      17963.6562      0.34   0.5717
pp         1       1634.4135       1634.4135      0.03   0.8635
shade      1     141038.5840     141038.5840      2.68   0.1324

                                  Standard
Parameter         Estimate           Error   t Value   Pr > |t|
Intercept       885.226166     541.1144935      1.64     0.1329
nat             -36.747013      62.8441736     -0.58     0.5717
pp               -0.051225       0.2904272     -0.18     0.8635
shade          1034.612650     631.4648244      1.64     0.1324


Regression of YY on Na P only

Residuals, Stud.residuals, and Cook's D for each observation:

 Obs      yy       resid    rstudent      cookd
   1    2876     444.441     2.41854    0.33173
   2    2078       1.179     0.00483    0.00000
   3    3052      73.999     0.33966    0.01623
   4    2265     274.819     1.18919    0.04236
   5     940    -364.516    -2.54890    1.40829
   6    2815     -14.543    -0.06260    0.00034
   7    2661      -5.985    -0.02638    0.00008
   8    2181    -145.982    -0.59914    0.01063
   9    2052      66.918     0.27745    0.00383
  10    2064    -189.481    -0.81678    0.03689
  11    1551     156.349     0.73882    0.07802
  12    2338    -165.926    -0.74449    0.05222
  13    1753     191.748     0.87726    0.07736
  14    2110    -323.019    -1.78594    0.48285


DROPPING OBSERVATION #5 WITH THE LARGE COOK'S D VALUE
Regression of YY on Na P only

Note that the model R^2 and regression coefficients are about the
same, so that the large CookD value was a false alarm.
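The choice of observation 5 can be cross-checked against two common rules of thumb for Cook's D: values above 1, and values above 4/n. Both cutoffs are conventions assumed here for illustration, not something the SAS output states, and the variable names are hypothetical. The Cook's D values are those listed for the (Na P) model on all 14 observations.

```python
# Cook's D by observation number, copied from the (Na P) residual listing.
COOKS_D = {
    1: 0.33173, 2: 0.00000, 3: 0.01623, 4: 0.04236, 5: 1.40829,
    6: 0.00034, 7: 0.00008, 8: 0.01063, 9: 0.00383, 10: 0.03689,
    11: 0.07802, 12: 0.05222, 13: 0.07736, 14: 0.48285,
}

n_obs = len(COOKS_D)
strict_cutoff = 1.0            # assumed convention: D > 1
loose_cutoff = 4.0 / n_obs     # assumed convention: D > 4/n

over_strict = sorted(obs for obs, d in COOKS_D.items() if d > strict_cutoff)
over_loose = sorted(obs for obs, d in COOKS_D.items() if d > loose_cutoff)

print(f"D > 1        : observations {over_strict}")
print(f"D > 4/n={loose_cutoff:.3f}: observations {over_loose}")
```

With these cutoffs observation 5 is the only one above 1, while observations 1 and 14 exceed only the looser 4/n threshold.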
The GLM Procedure

Number of Observations Read    13
Number of Observations Used    13

Dependent Variable: yy Apple Taste

                                Sum of
Source              DF         Squares     Mean Square   F Value   Pr > F
Model                2     1967646.074      983823.037     24.35   0.0001
Error               10      403971.926       40397.193
Corrected Total     12     2371618.000

R-Square    Coeff Var    Root MSE     yy Mean
0.829664     8.769220    200.9905    2292.000

Source    DF       Type I SS     Mean Square   F Value   Pr > F
nat        1     1161195.637     1161195.637     28.74   0.0003
pp         1      806450.437      806450.437     19.96   0.0012

Source    DF     Type III SS     Mean Square   F Value   Pr > F
nat        1     1206027.985     1206027.985     29.85   0.0003
pp         1      806450.437      806450.437     19.96   0.0012

                                  Standard
Parameter         Estimate           Error   t Value   Pr > |t|
Intercept      444.6700926     274.9555543      1.62     0.1369
nat             67.3590079      12.3279987      5.46     0.0003
pp               0.3014482       0.0674683      4.47     0.0012

Residuals, Stud.residuals, and Cook's D for each observation

 Obs      yy       resid    rstudent      cookd
   1    2876     334.147     2.27557    0.39128
   2    2078     -18.097    -0.09045    0.00036
   3    3052     117.568     0.67346    0.06431
   4    2265     172.622     0.91030    0.04018
   5    2815      49.683     0.26410    0.00663
   6    2661     125.776     0.72681    0.07683
   7    2181    -164.408    -0.83944    0.02062
   8    2052      35.205     0.17788    0.00166
   9    2064    -310.486    -1.94624    0.23821
  10    1551       9.467     0.05646    0.00070
  11    2338    -264.861    -1.64762    0.25834
  12    1753      82.031     0.45966    0.02883
  13    2110    -168.646    -1.14156    0.33581
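How close "about the same" is can be quantified from the two (Na P) parameter tables, with and without the dropped observation. The sketch below reports each coefficient's change as a percentage and in units of its standard error from the 14-observation fit. This is a rough scale chosen for illustration, not the DFBETAS statistic that SAS can produce, and the function and variable names are assumptions.

```python
# (estimate, standard error) from the (Na P) fit on all 14 observations.
FULL_FIT = {
    "Intercept": (125.8072820, 299.8488985),
    "nat": (63.6461622, 14.9914627),
    "pp": (0.4151238, 0.0619980),
}

# Estimates from the (Na P) fit after dropping the high Cook's D observation.
DROPPED_FIT = {
    "Intercept": 444.6700926,
    "nat": 67.3590079,
    "pp": 0.3014482,
}


def coefficient_shift(full_fit, dropped_fit):
    """Change in each estimate, as a percentage and in full-fit standard errors."""
    shift = {}
    for name, (estimate, std_error) in full_fit.items():
        change = dropped_fit[name] - estimate
        shift[name] = {
            "pct_change": 100.0 * change / estimate,
            "change_in_se": change / std_error,
        }
    return shift


shift = coefficient_shift(FULL_FIT, DROPPED_FIT)
for name, row in shift.items():
    print(f"{name:>9s}: {row['pct_change']:+8.2f}%  "
          f"({row['change_in_se']:+.2f} standard errors)")
```

On this scale the sodium coefficient moves by about 6 percent, while the phosphorus coefficient drops by roughly 27 percent, a little under two of its original standard errors. Both terms remain significant in the 13-observation fit, and R-Square changes only from 0.836 to 0.830.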