*************************************************************;
* A three-factor ANOVA with 4 observations per cell:
*
* Assume that we make measurements of steel quality at various
*   levels of three different factors, carbonation (3 levels),
*   pressure (3 levels), and temperature (2 levels). Four
*   observations (steel samples) are made at each combination of the
*   three factors, for a total of 3x3x2x4 = 72 samples and
*   3x3x2 = 18 cells.
*
* In more detail, the three factors (and their levels) are
*     Carbonation:   C10   C12   C14
*     Pressure:      P25   P35   P45
*     Temperature:   Cool  Hot
*
* Thus (C10,P35,Hot) is one cell, where C stands for carbonation
*   level and P for pressure.
*
* In general, a full factorial model with three factors has
*     3 main effects (which we call carb, pres, temp in this case)
*     3 two-way interactions (carb*pres, carb*temp, pres*temp) and
*     1 three-way interaction (carb*pres*temp)
*
* The three-way interaction is essentially the interaction between
*   any one factor and the two-way interaction of the other two.
*   These seven effects together are equivalent to a one-way ANOVA
*   on all of the cells. As before, a one-way ANOVA for cells can be
*   used to see how the cells vary.
*
* In this example, we might ask, which of the 3 main effects are
*   significant? Are any of the 3 two-way interactions significant?
*   Is the 3-way interaction significant? For the significant
*   interactions, what do the interactions mean in terms of the
*   mean values of the cells of the factors involved?
*
* Finally, which of the 18 cells have the highest expected steel
*   quality?
*
*
* 2^3 FACTORIAL DESIGNS: Suppose that we had three factors A,B,C
*   with two levels of each factor. View the 8 cell means as a
*   three-dimensional 2x2x2 table.
*   The numerator SSs for the 7 effects are the same constant
*   times the square of the inner product (the sum of elementwise
*   products) of the 2x2x2 cell-mean matrix with the matrices
*
*               I           A           B          A*B
*            B1  B2      B1  B2      B1  B2      B1  B2
*        --------------------------------------------------
*   C1:       1   1      -1  -1      -1   1       1  -1
*             1   1       1   1      -1   1      -1   1
*        --------------------------------------------------
*   C2:       1   1      -1  -1      -1   1       1  -1
*             1   1       1   1      -1   1      -1   1
*        --------------------------------------------------
*
*
*               C          C*A         C*B        C*A*B
*            B1  B2      B1  B2      B1  B2      B1  B2
*        --------------------------------------------------
*   C1:       1   1      -1  -1      -1   1       1  -1
*             1   1       1   1      -1   1      -1   1
*        --------------------------------------------------
*   C2:      -1  -1       1   1       1  -1      -1   1
*            -1  -1      -1  -1       1  -1       1  -1
*        --------------------------------------------------
*
*
*   where `C1' and `C2' represent the two layers of the 2x2x2
*   matrices in three dimensions, and the two rows within each
*   layer are the two levels of A. This shows graphically why
*   the three-way interaction (C*A*B) can be viewed as the
*   interaction between C and A*B.
*
*
* SAS NOTES:
* (1) The condensed datalines block below allows us to list all
*   72 observations clearly in just 9 lines of code. In contrast,
*   `proc print' output for the dataset has 72 lines of output.
* (2) In proc glm below,
*
*         model yy = carb | pres | temp;
*
*   tells SAS to look at the full factorial model (carb, pres, temp,
*   and all of their interactions). It is safer to specify the model
*   this way than to try to list all 7 effects explicitly and risk
*   leaving one out.
*
*
* Stanley Sawyer, Washington University, November 16, 2005
*************************************************************;

title 'THREE-WAY ANOVA FOR STEEL QUALITY - YOURNAME';
options ls=75 ps=60 pageno=1 nocenter;

data msteel;
  retain carb pres temp;
  input xx$ @@;
  * Since `Cool' and `C10' both begin with `C', ;
  * we must test `Temp' before `Carb'           ;
  if xx='Cool' or xx='Hot' then temp=xx;
  else if substr(xx,1,1)='C' then carb=xx;
  else if substr(xx,1,1)='P' then pres=xx;
  else do; yy=input(xx,12.0); output; end;
  * Give expanded names for the three variables;
  label carb='Carbonation' pres='Pressure' temp='Temperature';
  drop xx;
datalines;
C10 P25 Cool 185 208 187 208   Hot 190 191 195 189
    P35 Cool 211 190 205 182   Hot 202 189 203 199
    P45 Cool 180 187 196 208   Hot 197 236 220 206
C12 P25 Cool 185 197 184 186   Hot 191 179 177 177
    P35 Cool 196 197 198 188   Hot 209 202 201 207
    P45 Cool 215 211 197 208   Hot 240 202 224 229
C14 P25 Cool 192 206 180 200   Hot 194 183 186 183
    P35 Cool 197 190 192 191   Hot 219 215 190 191
    P45 Cool 186 197 228 230   Hot 228 235 223 229
;

*************************************************************;
* The right-hand side in the model statement expands to the 7
*   effects of a full factorial model with three factors.
* We also compare the main effects of pressure (at alpha=0.01)
*   and temperature (at alpha=0.05).
*
* We write the predicted values and residuals to another dataset
*   for a later analysis of the validity of the model assumptions.
*************************************************************;

proc glm data=msteel;
  classes carb pres temp;
  model yy = carb | pres | temp;
  means pres / duncan alpha=0.01;
  means temp / duncan;
  output out=msteelrr p=predict r=resid;
run;

*************************************************************;
* We note that the only significant effects are Pressure
*   (the main effect), the two 2-way interactions of Pressure with
*   Carbonation and Temperature, and the main effect of Temperature.
*
* Pressure seems to be the driving force here, since most effects
*   involving pressure are significant. To get a better idea about
*   how pressure interacts with the other two factors, let's look at
*   interaction plots for
*       pressure x temperature    and
*       pressure x carbonation
*
* Note that the means in these interaction plots are automatically
*   averaged over all levels of the 3rd factor, Carbonation
*   or Temperature (respectively).
*************************************************************;

* Decrease the page height to make nicer lineprinter plots;
options ps=35;

proc means nway noprint data=msteel;
  title2 'PRES x TEMP INTERACTION PLOT';
  classes pres temp;
  var yy;
  output out=prtemp mean=;
run;

proc plot data=prtemp;
  plot yy*pres=temp;
run;

*************************************************************;
* Now Pressure x Carbonation: Note that the default SAS dataset is
*   now `prtemp', so that `data=msteel' here is important.
*************************************************************;

proc means nway noprint data=msteel;
  title2 'PRES x CARB INTERACTION PLOT';
  classes pres carb;
  var yy;
  output out=prcarb mean=;
run;

*************************************************************;
* We have a plotting problem for Carbonation:
*
* The levels of carb are C10, C12, C14, so that the first letter of
*   `carb' is `C' in all cases. If we made an interaction plot with
*   `carb' as the plotting symbol, then all symbols would be `C' and
*   we would not be able to tell them apart.
* We reopen the data set `prcarb' to add a new variable for `carb'
*   with distinct first letters for the three carbonation levels.
*************************************************************;

data prcarb;
  set prcarb;
  carbsym=substr(carb,3,1);   * 0 or 2 or 4;
run;

proc plot data=prcarb;
  title2 'PRES x CARB INTERACTION PLOT';
  plot yy*pres=carbsym;
run;

*************************************************************;
* While we're at it, let's also plot the residuals against the
*   predicted values.
* `Msteelrr' is the original dataset (msteel) with residual and
*   predicted values added.
* Another test would be to look at the three residual x factor level
*   plots.
*************************************************************;

proc plot data=msteelrr;
  title2 'RESIDUAL x PREDICTED VALUE PLOT';
  plot resid*predict;
run;

* Restore the previous page height;
options ps=60;

*************************************************************;
* Let's also test normality of the residuals.
*************************************************************;

proc univariate normal plot data=msteelrr;
  title2 'RESIDUALS FOR THE THREE-WAY ANOVA';
  var resid;
run;

*************************************************************;
* Finally, do a one-way ANOVA on cells to compare output levels at
*   various combinations of the three factors.
* Note the approach to building cell names. We could have defined
*   cell=pres||carb||temp, but that would have made the cell names
*   too long and awkward to display properly.
*************************************************************;

data msteel;
  set msteel;
  * A buffer to define a unique cell name;
  cell="PPP_CCC_TEMP";     * Or any string with _ at posns 4 and 8;
  substr(cell,1,3)=pres;   * Rewrite PPP  as P35 etc    ;
  substr(cell,5,3)=carb;   * Rewrite CCC  as C10 etc    ;
  substr(cell,9,4)=temp;   * Rewrite TEMP as Hot or Cool;
run;

proc glm;
  title2 "ONE-WAY ANOVA ON CELLS";
  class cell;
  model yy = cell;
  means cell / duncan;
run;
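
*************************************************************;
* A possible follow-up (a sketch only, not part of the original
*   analysis): the three residual x factor level plots mentioned
*   earlier as another check of the model assumptions.
* This assumes that the dataset msteelrr written by the output
*   statement of the three-way proc glm still exists, and that
*   proc plot accepts the character-valued factors on the
*   horizontal axis, as it did for pres in the interaction plots.
* Roughly equal vertical spread of the residuals at each level of
*   a factor would be consistent with equal error variances.
*************************************************************;

* Decrease the page height again for the lineprinter plots;
options ps=35;

proc plot data=msteelrr;
  title2 'RESIDUAL x FACTOR LEVEL PLOTS';
  plot resid*carb resid*pres resid*temp;
run;

* Restore the previous page height;
options ps=60;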