***************************************************************;
* A THREE-WAY ANOVA for assembly quality for computers, with factors
*   that are not quite crossed.
*
* The three factors are:
*    (Assembly) Method:   M1 M2 M3
*    (Workplace) Shift:   MShift EShift  (Morning/Evening shift)
*    Operators:           Op1 Op2 Op3 Op4
*                         Op5 Op6 Op7 Op8
*
* Four operators were contracted for each shift:
*    Morning:  Op1 Op2 Op3 Op4
*    Evening:  Op5 Op6 Op7 Op8
*
* The morning operators (Op1-Op4) work only on the morning shift,
*   while the evening operators (Op5-Op8) work only on the evening
*   shift.
*
* Each operator does 2 assemblies for each of 3 methods in a
*   randomized order. The total number of observations is
*
*      8 (operators) x 3 (methods) x 2 (runs) = 48.
*
* Operator is crossed with Method, since each operator has values for
*   all three methods. In contrast, Operator is NESTED under Shift,
*   since each operator has values for only one shift.
*
* Method and Shift are CROSSED factors, since each method is done
*   (by someone) on each shift. The third factor Operator is
*   CROSSED with Method but NESTED within Shift.
*
* Treating Op (incorrectly) as a third CROSSED factor with 4 levels
*   would effectively treat Op1 and Op5, Op2 and Op6, etc., as
*   identical twins. In some cases this does not matter. For
*   example, the SS term for Method sums all Operator contributions
*   for each Method across all shifts, and Shift sums all Operator
*   contributions within each Shift and contrasts the difference
*   between shifts. Thus neither effect depends on how, or whether,
*   Operators are paired across Shifts. However, the SS term for Op
*   (viewed as having 4 levels across both Shifts) is ambiguous, and
*   can have as many as 24 different values, corresponding to the
*   4! = 24 different pairings of the 4 Morning-shift Operators with
*   the 4 Evening-shift Operators. Shift*Op is ambiguous for the same
*   reason.
*   In contrast, Op(Shift) is not ambiguous, since it
*   correctly treats Operator as having 4 levels within each Shift,
*   and then sums the result over the two shifts.
*
* The error SS (SSE) is not affected by the nesting of Op, since, in
*   any full factorial model, SSE is a sum of squares of within-cell
*   contributions. In this case, SSE is the sum of the squares of
*   the differences between the two replications for all Method*Op
*   combinations.
*
* Some questions are inherently unanswerable with data from a nested
*   design. For example, the model cannot tell the difference between
*   a significant contribution for Shift and a significant difference
*   between the average efficiencies of the four morning Operators
*   (Op1 Op2 Op3 Op4) and the four evening Operators
*   (Op5 Op6 Op7 Op8). For example, a significant Shift effect could
*   be due to better workers migrating to the Morning shift for a
*   variety of reasons, or it could be due to better lighting levels
*   during the Day shift. If the owners of a small factory are more
*   likely to be around during the day than during the evening, then
*   Morning workers may be more efficient due to more training, and
*   either Morning or Evening workers may be more efficient due to
*   better morale. However, there might also be a significant Shift
*   effect that is simply due to a chance sampling of 4 Morning
*   operators that are more efficient (or less efficient) than their
*   Evening counterparts.
*
* Although Op (when viewed as having 4 levels across both Shifts,
*   i.e. by artificially twinning Op1 with Op5, etc) and Op*Shift
*   are undefined or ambiguous, their sum
*
*      Op(Shift) = Op + Op*Shift
*
*   is not ambiguous. Similarly, Op*Method and Op*Method*Shift
*   measure the extent to which the contribution of individuals (or
*   their interactions with Shift) are nonadditive across Shifts.
*   Both of these may differ depending on how Operators are paired
*   across Shifts.
*   However, Op*Method(Shift), which sums Op*Method
*   (defined within each Shift) across Shifts, is not ambiguous.
*   Also
*
*      Op*Method(Shift) = Op*Method + Op*Method*Shift
*
*   for any set of pairings of operators across shifts.
*
* In general, a full-factorial model with 3 crossed factors A, B,
*   and Op has 7 effects:
*      Main (3):                 A  B  Op
*      2-way interactions (3):   A*B  A*Op  B*Op
*      3-way interaction:        A*B*Op
*
* If A and B are crossed, but Op is crossed with A and nested
*   under B, then Op, A*Op, B*Op, and A*B*Op are either ambiguous or
*   meaningless. However, we can replace
*      Op    B*Op     by  Op(B)
*      A*Op  A*B*Op   by  A*Op(B)
*
*   which are unambiguous and defined only by Operator differences
*   within Shifts.
*
* This leads to a full-factorial model with 5 effects:
*      Main (2):                 A  B
*      Simple 2-way interaction: A*B
*      Nested main effect:       Op(B)
*      Nested interaction:       A*Op(B)
*
*   (Here A=Method, B=Shift, and Op=Operator)
*
* In this case, we have 2 observations for each cell, so that we can
*   test all 5 effects. If we had only one observation per cell, we
*   could use
*
*      Op(B) + A*Op(B)
*
*   (more exactly SS[Op(B)]+SS[A*Op(B)]) to test A, B, and A*B, in
*   much the same way as in a Split-Plot analysis. However, with a
*   partially nested model of this type, it is generally considered
*   better to use Op(B) to test B and to use A*Op(B) to test A and
*   A*B, since each test then needs fewer assumptions. We will
*   discuss this more when we consider nested models with one
*   observation per cell.
*
* (Data adapted from Montgomery,
*   "Design and Analysis of Experiments", 1991, p453, where
*   "shift" was a second assembly-method factor and "qual" was
*   assembly time.)
*
***************************************************************;

title "WORKPLACE METHODS, SHIFTS, and OPERATORS";
title2 "TWO CROSSED factors with an Operator factor that is";
title3 " NESTED within one factor and CROSSED with the other";
options ls=75 ps=60 pageno=1 nocenter;

data nfactory;
   retain Method Shift Op;
   input xx$ @@;
   * We must be a little careful here, since M1 M2 M3 and MShift ;
   *   all begin with M;
   if substr(xx,2,6)='Shift' then Shift=xx;
   else if substr(xx,1,1)='M' then Method=xx;
   else if substr(xx,1,2)='Op' then Op=xx;
   else do;
      Qual=input(xx,12.0);
      output;
   end;
   drop xx;
datalines;
M1 MShift  Op1 22 24  Op2 23 24  Op3 28 29  Op4 25 23
   EShift  Op5 26 28  Op6 27 25  Op7 28 25  Op8 24 23
M2 MShift  Op1 30 27  Op2 29 28  Op3 30 32  Op4 27 25
   EShift  Op5 29 28  Op6 30 27  Op7 24 23  Op8 28 30
M3 MShift  Op1 25 21  Op2 24 22  Op3 27 25  Op4 26 23
   EShift  Op5 27 25  Op6 26 24  Op7 24 27  Op8 28 27
;

proc print;
   title4 'THE DATA AS SAS SEES IT';
run;

***************************************************************;
* The nested linear model:
***************************************************************;
proc glm;
   title4 'GLM NESTED ANALYSIS WITH FIVE EFFECTS';
   classes Method Shift Op;
   model Qual = Method Shift Method*Shift
                Op(Shift) Method*Op(Shift);
   means Method / duncan alpha=0.001;
run;

***************************************************************;
* Method and Op(Shift) in the output are highly significant (P<0.01)
*   and Method*Shift and Method*Op(Shift) are borderline significant
*   (0.01<P<0.05).
*
* A Method*Shift interaction plot of the cell means shows
*   Morning > Evening on M2, but Morning < Evening for
*   M1 and M3. The plot suggests that M2 may be better than M1 and
*   M3 even on the Evening shift, but this is not as clear.
*
* The nested GLM output shows that Method*Op(Shift) (P=0.0360) is
*   almost as significant as Method*Shift (P=0.0298), so that we
*   should guard against the significance of Method*Shift being
*   an artifact of the choices of Operator, or perhaps even
*   caused by a single Operator.
*
* We next show an Op*Method interaction plot for each Shift.
*   If either plot shows something interesting, following up with
*   the individual operators may give information about the causes
*   of the Method*Shift behavior in the last interaction plot.
*
* In fact, this analysis shows that the Method*Shift and
*   Op*Method(Shift) interactions are most likely due to a single
*   operator, Op7, who had very poor results with Method M2. This
*   should be investigated before any decisions are made, to see
*   if the reasons are generic or perhaps the result of a poor
*   explanation of M2 on the evening shift, or possibly the
*   methods were explained when Op7 was absent or on vacation.
*
* Note that, within each Shift, the design reduces to a two-way
*   layout with two crossed factors (Method and Op) with two
*   observations per cell.
*
* Since we need exactly the same analysis for each Shift, and to
*   give another example of using macros in SAS, we first define a
*   SAS macro that displays an interaction plot for one shift
*   and then apply it to each of the two shifts.
*
* We define a variable MethPlot with plotting symbols 1,2,3 for
*   Methods M1,M2,M3, and also define shift numbers 1,2 for the
*   Morning and Evening shifts. We need plotting symbols 1,2,3 for
*   the interaction plots, and the syntax for SAS macros is easier
*   with numerical macro arguments, here shifts =1,2 instead of
*   MShift and EShift.
***************************************************************;

title4 "INTERACTION PLOT for METHOD*OP WITHIN EACH SHIFT";
data nfactory;
   set nfactory;
   MethPlot=substr(Method,2,1);     /* 1 or 2 or 3 */
   * Define a shift number: =1 for morning and =2 for evening;
   if Shift="MShift" then Shiftnum=1;
   else if Shift="EShift" then Shiftnum=2;
   else Shiftnum=3;                 * This should not happen;
run;

***************************************************************;
* The following defines a macro named op_macro with one macro
*   argument Snum (for shift). The MACRO DEFINITION does not do
*   any actual calculations, nor does it produce any output.
*   That only happens when the macro is CALLED or INVOKED;

/* For example, the call for the morning shift is

      %op_macro(Snum=1)

   Invoking or calling a macro seems to be the only instance in SAS
   where a statement is not followed by a semicolon.

   You can also define macros with more than one or no arguments, as in

      %macro yearend (Snum=, Year=, Target=);
   or
      %macro simple;          (with no macro arguments)

   These examples are kept in a slash-star comment because the macro
   processor still acts on percent-sign triggers that appear inside
   star-style comments.                                              */
***************************************************************;

/* Start the macro definition. The macro definition ends when you
   say %mend op_macro;                                               */
%macro op_macro (Snum=);

   /* Subset nfactory to a dataset with Shiftnum=Snum records only */
   data nshift;
      set nfactory;
      if Shiftnum=&Snum;     /* Note how the macro argument Snum is invoked */
   run;

   /* We must say MethPlot rather than Method to make sure that
      MethPlot is included in the means dataset */
   proc means data=nshift nway noprint;
      class MethPlot Op;
      var Qual;
      output out=mscells mean= ;
   run;

   /* Since Op has 4 levels, we put it on the X axis.

      Generally speaking, you can use either "" or '' to define text
      strings, as in "This is a string" or 'This is a string'.
      However, macro variables are only expanded within a string when
      you use ""s. If the title statement below used '' instead of "",
      the title would display a literal ampersand followed by Snum,
      which would not be very informative. */
   proc plot data=mscells;
      title5 "METHOD*OP on SHIFT=&Snum";
      plot Qual*Op = MethPlot / vpos=25;
   run;

%mend op_macro;    /* End of definition of macro op_macro */

***************************************************************;
* The code above does not do anything. It just defines a macro.
*   To actually do something, we have to invoke it: ;
***************************************************************;
%op_macro(Snum=1)
%op_macro(Snum=2)
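
***************************************************************;
* OPTIONAL FOLLOW-UP (an illustrative sketch, not part of the
*   original analysis). It assumes that the dataset nfactory
*   already contains MethPlot and Shiftnum, as defined above.
*   The dataset name mshcells is hypothetical.
*
* (1) The commentary refers to a Method*Shift interaction plot.
*   One possible way to draw it is to plot the Method-by-Shift
*   cell means, with the shift number (1=Morning, 2=Evening) on
*   the X axis and the Method digit as the plotting symbol.
***************************************************************;
title4 "INTERACTION PLOT for METHOD*SHIFT (illustrative sketch)";
title5;                        /* clear the title set by op_macro */

proc means data=nfactory nway noprint;
   class MethPlot Shiftnum;
   var Qual;
   output out=mshcells mean= ; /* one mean of Qual per Method-Shift cell */
run;

proc plot data=mshcells;
   plot Qual*Shiftnum = MethPlot / vpos=25;
run;

***************************************************************;
* (2) The header comments note that, in a partially nested
*   model, it is generally considered better to test Shift
*   against Op(Shift), and to test Method and Method*Shift
*   against Method*Op(Shift). Assuming that the operators are
*   regarded as a random sample of possible operators (an
*   assumption made here only for illustration), the TEST
*   statement of PROC GLM requests F tests with those effects
*   as the error terms.
***************************************************************;
title4 "GLM TESTS WITH NESTED ERROR TERMS (illustrative sketch)";

proc glm data=nfactory;
   class Method Shift Op;
   model Qual = Method Shift Method*Shift
                Op(Shift) Method*Op(Shift);
   test h=Shift               e=Op(Shift);         /* Shift vs Op(Shift) */
   test h=Method Method*Shift e=Method*Op(Shift);  /* vs Method*Op(Shift) */
run;
quit;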