**********************************************************;
* A "Saturated" 8-run design extended by MIRRORING or
* FOLDING IN ALL COLUMNS
*
* The basic 2_{III}^{7-4} fractional factorial design has 8 runs
* with High/Low assignments for 7 factors A B C D E F G given by
*
*   Dummy:    a    b    c   ab   ac   bc  abc
*   Factor:   A    B    C    D    E    F    G
*    (1)     -1   -1   -1    1    1    1   -1
*    (2)      1   -1   -1   -1   -1    1    1
*    (3)     -1    1   -1   -1    1   -1    1
*    (4)      1    1   -1    1   -1   -1   -1
*    (5)     -1   -1    1    1   -1   -1    1
*    (6)      1   -1    1   -1    1   -1   -1
*    (7)     -1    1    1   -1   -1    1   -1
*    (8)      1    1    1    1    1    1    1
*
* (See FracFac74.sas.) This has the basic defining relations
*
*      D=AB   E=AC   F=BC   G=ABC
*
* with a complete set of defining relations
*
*   I = ABD = ACE = BCF = CDG = BEG = AFG = DEF
*     = ABCG = ABEF = ACDF = ADEG = BCDE = BDFG = CEFG = ABCDEFG
*
* We can augment this design by MIRRORING or FOLDING IN ALL COLUMNS
* by adding 8 more runs with assignments
*
*   Dummy:   -a   -b   -c  -ab  -ac  -bc -abc
*   Factor:   A    B    C    D    E    F    G
*    (9)      1    1    1   -1   -1   -1    1
*   (10)     -1    1    1    1    1   -1   -1
*   (11)      1   -1    1    1   -1    1   -1
*   (12)     -1   -1    1   -1    1    1    1
*   (13)      1    1   -1   -1    1    1   -1
*   (14)     -1    1   -1    1   -1    1    1
*   (15)      1   -1   -1    1    1   -1    1
*   (16)     -1   -1   -1   -1   -1   -1   -1
*
* Each 16x1 column in the augmented design is of the form
* (A(over) -A), that is, an 8x1 column A stacked on top of -A.
* If (for example)
*
*      (A1)(A2)...(Ab) = I
*
* is a relation with b letters in the 8-run design, then
*
*      (A1(over) -A1)(A2(over) -A2)...(Ab(over) -Ab) = (I(over) (-1)^b I)
*
* is the corresponding identity in the 16-run design. The
* right-hand side is the identity column only when b is even.
* This means that the defining relations for the 16-run design
* are exactly the relations of EVEN LENGTH in the 8-run
* design, or
*
*   I = ABCG = ABEF = ACDF = ADEG = BCDE = BDFG = CEFG
*
* In particular, the 16-run fully mirrored 2^{7-3} design has
* resolution IV, or is a 2_{IV}^{7-3} design.
*
* This means that estimates of main effects are confounded at worst
* with 3-way interactions, and 2-way interactions are confounded
* at worst only with 2-way interactions.
* Ignoring 3-way and higher interactions, we obtain separate
* estimates of
*
*   (Main effects)   A  B  C  D  E  F  G
*   (Two-way interactions, in alias groups of three)
*      BD = CE = FG
*      AD = CF = EG
*      AE = BF = DG
*      AB = EF = CG
*      AC = DF = BG
*      BC = DE = AG
*      CD = BE = AF
*
* We apply this to an experiment at a water filtration plant
* with 7 factors
*
*   A=Water_Supply   (Low=Reservoir High=Well)
*   B=Raw_Material   (on site or off site)
*   C=Temperature
*   D=Recycle
*   E=Caustic_Soda   (Fast or slow)
*   F=Filter_Cloth
*   G=Holdup_Time
*
* The response variable was filtration time, so that low values
* were desirable. Inspection of the data for the first 8 runs
* shows that the two lowest times both occurred with
*
*      A=C=E=High
*
* This suggested that the factors A C E were active. However, due to
* the confounding patterns in the basic 2_{III}^{7-4} design, it
* was possible that only two of A C E were active and that the
* estimate of the third main effect was really picking up a
* significant interaction of the other two. This suggested four
* distinct possibilities for major effects that might have
* led to the observed output:
*
*   (i) A C E   (ii) A C AC   (iii) A E AE   (iv) C E CE
*
* The 8-run 2_{III}^{7-4} design was augmented by 8 more runs to form
* the fully mirrored Resolution IV 2_{IV}^{7-3} design. This allowed
* estimates of main effects and their interactions to be
* distinguished, and suggested case (iii) above.
*
* See Tables 6.10-6.13 (pages 252-257) in text
*
**********************************************************;

options ls=75 ps=60 nocenter pageno=1;

title1 'A 2_{III}^{7-4} design with 8 observations - YOUR NAME';
title2 'Analysis of the basic design suggests 4 distinct possibilities';
title3 ' (i) A C E (ii) A C AC (iii) A E AE (iv) C E CE';
title4 'for major effects. 
We do 8 more runs to obtain a 16-run';
title5 ' 2_{IV}^{7-3} design that strongly suggests (iii).';
title6 'See text Table 6.10 p253 and Table 6.12 p254';

data filtration;
   * yy1 are observations for the first set of 8 runs;
   * yy2 are observations for the second set of 8 runs;
   input Row A B C yy1 yy2;
   * Define High/Low relations for seven factors;
   D=A*B; E=A*C; F=B*C; G=A*B*C;
   * Column yy1 is the first group of observations;
   Yield=yy1; Rep=1; output;
   * Column yy2 is the second group of 8 observations;
   * resulting from foldover in all columns;
   A=-A; B=-B; C=-C; D=-D; E=-E; F=-F; G=-G;
   Yield=yy2; Rep=2; Row=Row+8; output;
   * Descriptive names for the 7 factors;
   label A=Water_Supply B=Raw_Material C=Temperature D=Recycle
         E=Caustic_Soda F=Filter_Cloth G=Holdup_Time;
datalines;
1 -1 -1 -1 68.4 66.7
2  1 -1 -1 77.7 65.0
3 -1  1 -1 66.4 86.4
4  1  1 -1 81.0 61.9
5 -1 -1  1 78.6 47.8
6  1 -1  1 41.2 59.0
7 -1  1  1 68.7 42.6
8  1  1  1 38.7 67.6
;
run;

**********************************************************;
* Sort to keep the rows and replications in the proper order
**********************************************************;
proc sort data=filtration;
   by Rep Row;
run;

proc print data=filtration;
   title6 'The data as SAS sees it:';
   title7 ' D E F G are functions of A B C';
   title8 'Levels are reflected in all columns in the second replication';
run;

title2 'In the first replication, A, C, and E all seem to be large.';
title3 'The saturated 2_{III}^{7-4} design has confounding relations';
title4 ' A=A+BD+CE+FG B=B+AD+CF+EG C=C+AE+BF+DG,';
title5 ' D=D+AB+CG+EF E=E+AC+BG+DF F=F+AG+BC+DE G=G+AF+BE+CD';
title6 'Thus E could be an artifact of AC, C of AE, A of CE, or';
title7 ' possibly only the main effects are important. 
Which of the four?';
title8 'Foldover in all columns allows us to separate the odd-order (main)';
title9 ' effects from the even-order (2-way) effects';

**********************************************************;
* Estimate parameters separately for each set of 8 runs.
* The statement "by Rep" tells SAS to estimate regression
* parameters twice, once within each replication.
*
* The option "outest=parmrows" in "proc reg" tells SAS to write
* the effect parameter estimates for the two replications
* to the dataset parmrows as two rows.
*
* See FracFac74.sas for more comments.
**********************************************************;
proc reg data=filtration outest=parmrows;
   by Rep;
   model Yield = A B C D E F G;
run;

data parmrows;
   set parmrows;
   if Rep=1 then Name="Parms1";
   else if Rep=2 then Name="Parms2";
run;

proc transpose data=parmrows out=coldata name=Effect;
   id Name;
run;

data coldata;
   set coldata;
   * The text reports effects, which are twice the regression;
   * coefficients when the factors are coded -1 and +1;
   Book1=2*Parms1; Book2=2*Parms2;
   * Keep only the rows for the seven factor effects;
   if Effect ne "Intercept" and Effect ne "_RMSE_"
      and Effect ne "Yield" and Effect ne "Rep" then output;
run;

proc print data=coldata;
   title3 "PARAMETER AND BOOK PARAMETER VALUES IN TWO COLUMNS";
   title4 '(See Table 6.11 p253)';
   title5 'In the second replication, C seems to be large';
   title6 ' but in the opposite direction';
run;

**********************************************************;
* Compute the average and semi-difference to separate
* the main effects from the second-order interactions
**********************************************************;
data coldata;
   set coldata;
   * Averages and Semi-Differences for foldover;
   Average=(Book1+Book2)/2;
   SemiDf=(Book1-Book2)/2;
run;

proc print data=coldata;
   title3 'The average gives the main effects: C is now small';
   title4 'Semi-Df is the sum of three 2nd-order interactions';
   title5 'The large effects now seem to be A E and AE';
   title6 'AE is confounded with 2 other 2-way interactions,';
   title7 ' but the most parsimonious explanation is that all';
   title8 ' factors other than A and E are inert.';
run;
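**********************************************************;
* A minimal sketch, not part of the original program: check
* the aliasing described above directly from the 16 design
* rows. The names aliaschk, nBad, sumRep1 and sumAll are
* made up for this illustration.
*   nBad    counts runs where AE, BF and DG do not all agree
*   sumRep1 is the sum of C*(A*E) over the first 8 runs
*   sumAll  is the sum of C*(A*E) over all 16 runs
* If C is aliased with AE in the basic design and the
* foldover separates them, this should print nBad=0,
* sumRep1=8 and sumAll=0.
**********************************************************;
data aliaschk;
   set filtration end=last;
   AE = A*E;
   * Sum statements start at 0 and keep their value across rows;
   if AE ne B*F or AE ne D*G then nBad + 1;
   if Rep=1 then sumRep1 + C*AE;
   sumAll + C*AE;
   if last then output;
   keep nBad sumRep1 sumAll;
run;

proc print data=aliaschk;
   title2 'CHECK OF THE ALIAS PATTERN C = AE = BF = DG';
run;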
**********************************************************;
* Let us check our conclusions by doing a 2^3 design with
* two observations per cell on A C E:
**********************************************************;
data filtration;
   set filtration;
   AC=A*C; AE=A*E; CE=C*E; ACE=A*C*E;
run;

proc reg data=filtration;
   title2 'ANALYSIS OF 2^3 DESIGN (2 OBSERVATIONS/CELL) FOR A C E';
   title3 'Only E AE are significant, with only A borderline';
   model Yield = A C E AC AE CE ACE;
run;

title2 'INTERACTION PLOT of A and E';
proc means nway noprint data=filtration;
   class A E;
   var Yield;
   output out=cmeans mean=;
run;

data cmeans;
   set cmeans;
   if A<0 then APLOT="L"; else APLOT="H";
   if E<0 then EPLOT="L"; else EPLOT="H";
   label APLOT=Water_Supply EPLOT=Caustic_Soda;
run;

proc plot data=cmeans;
   plot Yield*A = Eplot / vpos=30;
run;

title2 'AN ALTERNATIVE DERIVATION OF PARAMETER ESTIMATES:';
title3 "For the 16 runs together, `Average' estimates are";
title4 " `main effect' estimates and `Semi-Df' estimates";
title5 ' are interactions with Rep (the Replication number)';
title6 'This gives the 16 parameter estimates directly.';
title7 'The main effect of C is not significant';
title8 'RepC, which is a stand-in for AE, is strongly significant';

data filtall;
   set filtration;
   * Rep has to be +-1 for this to work;
   if Rep=1 then Rep=-1;
   else if Rep=2 then Rep=1;
   * Define the 7 Rep-by-factor interaction columns;
   * With this coding RepC equals -A*E in every run, so the RepC;
   * coefficient has the opposite sign to the Semi-Df column above;
   RepA=Rep*A; RepB=Rep*B; RepC=Rep*C; RepD=Rep*D;
   RepE=Rep*E; RepF=Rep*F; RepG=Rep*G;
run;

proc reg data=filtall outest=filrows;
   * Carry out estimates for both replications together;
   model Yield = A B C D E F G
                 Rep RepA RepB RepC RepD RepE RepF RepG;
run;

proc transpose data=filrows out=filcols name=Effect;
   id _TYPE_;
run;

data filcols;
   set filcols;
   BookParm=2*Parms;
   AbsParms=abs(Parms);
   if Effect ne "Intercept" and Effect ne "_RMSE_"
      and Effect ne "Yield" then output;
run;

proc sort data=filcols;
   by descending AbsParms;
run;

proc print data=filcols;
   title2 'SORTED PARAMETER ESTIMATES for 
2_{IV}^{7-3} design';
run;

title2 'IN THE SECOND REPLICATION, runs 10,11,13,16 have a complete set';
title3 ' of levels for factors A C E. This means that we can resolve';
title4 ' C from AE with just 12 runs.';
title5 'Since a 2^3 design has 8 parameters, we can use these 12 runs';
title6 ' to get P-values for A C E and all of their interactions';
title7 ' with error-df=4';

**********************************************************;
* Keep all of the runs in the first replication,
* but only runs 10,11,13,16 from the second replication
**********************************************************;
data filt12;
   set filtration;
   if Rep=1 then output;
   else if Row=10 or Row=11 or Row=13 or Row=16 then output;
run;

proc print data=filt12;
   title8 'THE 12-RUN DATASET AS SAS SEES IT:';
   var Rep Row A B C D E F G yy1 yy2 Yield;
run;

proc reg data=filt12;
   title5 'With 12 observations, only E AE are significant';
   title6 ' and C is not significant, with A CE borderline';
   title7 'Thus the additional half-replication was enough';
   title8 ' to resolve the C=AE ambiguity';
   model Yield = A C E AC AE CE ACE;
run;

proc glm data=filt12;
   title6 'PROC GLM FOR THE 2^3 DESIGN WITH 12 OBSERVATIONS';
   title7 'This shows that the design is unbalanced,';
   title8 " but the Type-III table conclusions are the same.";
   class A C E;
   model Yield = A | C | E;
run;
quit;
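**********************************************************;
* A minimal sketch, not part of the original program: count
* the runs in each A-by-C-by-E cell of the 12-run dataset to
* see the imbalance mentioned in the PROC GLM titles. The
* four cells covered by the first replication should hold
* 2 runs each, and the four cells added from runs
* 10,11,13,16 should hold 1 run each.
**********************************************************;
proc freq data=filt12;
   title2 'CELL COUNTS FOR A C E IN THE 12-RUN DATASET';
   tables A*C*E / list;
run;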