Math 439 Homework 4 - Fall 2010

HOMEWORK #4 due Tuesday 12-7

Arrange your answers in three parts in the following order:
Part I: Your answers to all questions, either written by hand or using a word processor.
Part II: The SAS programs (*.sas files) that you used for all problems in which you used SAS.
Part III: The output from the SAS programs in Part II.
For all problems in which you use SAS, either copy or transcribe answers from the SAS output to Part I or else refer in Part I to specific pages in Part III by saying (for example) "The scatterplot or matrix for Problem 3 is on page 17 of the SAS output (Part III)." Make sure that you have consecutive page numbers on the SAS output in Part III by adding your own page numbers to the SAS output if necessary, so that (for example) you don't have several different page 1s in Part III. If you like, you can number pages as (for example) "Page 3-2" for the second page of output for Problem 3.

1. The responses of 30 patients to three experimental drugs (named B1, B2, and B3) are given in the following table. M and F stand for male and female, respectively.

      Table 1.  Responses of 30 patients to three drugs
      Sex  Drug   Responses
        F   B1    43   33   35   43   35
        F   B2    53   60   53   53   42
        F   B3    38   35   41   31   38
        M   B1    37   30   30   19   28
        M   B2    22   15   28   18   18
        M   B3    30   35   25   24   22

These drugs are known to be more effective in female patients.
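For reference, one standard way to write a two-factor full factorial model for the layout in Table 1 is shown here. The symbols are assumptions introduced for illustration and are not taken from the course notes: i indexes Sex, j indexes Drug, and k indexes the five patients in each cell.

```latex
y_{ijk} = \mu + \alpha_i + \beta_j + (\alpha\beta)_{ij} + \varepsilon_{ijk},
\qquad i = 1,2,\quad j = 1,2,3,\quad k = 1,\dots,5,
```

Here \alpha_i and \beta_j are the main effects of Sex and Drug, (\alpha\beta)_{ij} is the interaction, and the \varepsilon_{ijk} are independent errors, giving 2*3*5=30 observations in all.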

(i) Use SAS to run a two-factor full factorial ANOVA. Is the Model Test significant? What is its P-value?

(ii) Which of the two main effects and the interaction effect are significant? Which are highly significant? What are the P-values of the significant effects?

(iii) If the interaction is significant, display an interaction plot. Is the interaction visible in the interaction plot? What can you conclude about the interaction and how it affects the dependent variable? (Hint: See TwoWayInt.sas and MACorSinDogs.sas for examples of interaction plots.)
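The steps in Problem 1 might be set up along the following lines. This is only a sketch of one possible approach, not the course's reference program: the data set names (drugs, cellmeans) and variable names (sex, drug, rep, response, meanresp) are made up for illustration, and the use of proc sgplot assumes SAS 9.2 or later. The course examples TwoWayInt.sas and MACorSinDogs.sas may draw the interaction plot differently.

```sas
/* Read Table 1: one data line per Sex-by-Drug cell, five responses per line */
data drugs;
  input sex $ drug $ @;          /* trailing @ holds the line for the loop below */
  do rep = 1 to 5;
    input response @;
    output;                      /* one observation per patient */
  end;
  datalines;
F B1 43 33 35 43 35
F B2 53 60 53 53 42
F B3 38 35 41 31 38
M B1 37 30 30 19 28
M B2 22 15 28 18 18
M B3 30 35 25 24 22
;
run;

/* Two-factor full factorial model: both main effects plus the interaction */
proc glm data=drugs;
  class sex drug;
  model response = sex drug sex*drug;
run;
quit;

/* Cell means for an interaction plot (hypothetical output data set cellmeans) */
proc means data=drugs nway noprint;
  class sex drug;
  var response;
  output out=cellmeans mean=meanresp;
run;

/* One line per sex across the three drugs */
proc sgplot data=cellmeans;
  series x=drug y=meanresp / group=sex markers;
run;
```

In a plot of this kind, roughly parallel lines suggest little interaction, while lines that cross or diverge suggest that the drug effect differs by sex.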

2. A chemical engineer is interested in the efficiency of a chemical process as a function of three variables: Drubness, with three levels (Low,Med,High), Turgidity, with three levels (Turg1,Turg2,Turg3), and Time, with two levels (AM,PM). Two independent runs were made for each setting of the three variables, for a total of 3*3*2*2=36 observations. The resulting efficiencies are listed in Table 2.

    Table 2. Efficiencies of a Chemical Process

                    Low                       Med                       High
              AM           PM           AM           PM           AM           PM
  Turg1:   755,  370    192,  815   1385, 2118    458,  557    732, 1103   1023,  533
  Turg2:  3049, 1087    117,  509   1407, 3095    802,  318    431,  592    533, 8814
  Turg3:  3289, 1517    359,  328   2118,  977    541,  201    163,  364   1739, 1227

  (Each cell lists the two independent runs for that setting.)

(i) Use SAS to run a three-factor full factorial model with the three variables as factors. Is the Model Test significant? What is its P-value?

(ii) Plot the residuals against the predicted values in the model. Do the residuals appear to be independent of the predicted values? Or do their absolute values appear to grow as a function of the predicted value? (Hint: See ThreeRegIml.sas for an example of a residual plot. The output statement can be used in proc glm as well as in proc reg.)

(iii) Run the full factorial model again with the values in Table 2 replaced by their logarithms. Is the Model Test now more significant? Analyze the residuals of the log-transformed data in the same way as in part (ii). Do they now look more independent of the predicted values?

(iv) Which of the main effects of Drubness, Turgidity, and Time are significant for the log-transformed data? Which are highly significant? Which of the three two-way interactions and the one three-way interaction are significant or highly significant? What are the P-values for the significant effects?

(v) For each of the two-way interactions that are significant, display an interaction plot. For each such interaction, is the interaction visible in the interaction plot? What can you conclude about the interaction and how it affects the dependent variable (that is, the chemical efficiency)?
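A possible starting point for Problem 2 is sketched below. It is an illustration under stated assumptions rather than a reference solution: the data set and variable names (chem, resid, drubness, turgidity, timeofday, rep, efficiency, logeff, pred, res) are hypothetical, the data lines assume that each row of Table 2 reads Low AM, Low PM, Med AM, Med PM, High AM, High PM with two runs per cell, the logarithm is taken to be the natural logarithm, and proc sgplot assumes SAS 9.2 or later.

```sas
/* Read Table 2: one data line per Turgidity level, 12 efficiencies per line */
data chem;
  length turgidity $ 5 drubness $ 4 timeofday $ 2;
  input turgidity $ @;
  do drubness = 'Low', 'Med', 'High';
    do timeofday = 'AM', 'PM';
      do rep = 1 to 2;                 /* two independent runs per cell */
        input efficiency @;
        logeff = log(efficiency);      /* natural logarithm (assumed base) */
        output;
      end;
    end;
  end;
  datalines;
Turg1  755  370  192  815 1385 2118  458  557  732 1103 1023  533
Turg2 3049 1087  117  509 1407 3095  802  318  431  592  533 8814
Turg3 3289 1517  359  328 2118  977  541  201  163  364 1739 1227
;
run;

/* Three-factor full factorial model on the log scale;
   the bar notation expands to all main effects and interactions */
proc glm data=chem;
  class drubness turgidity timeofday;
  model logeff = drubness|turgidity|timeofday;
  output out=resid predicted=pred residual=res;   /* save fitted values and residuals */
run;
quit;

/* Residuals against predicted values, with a reference line at zero */
proc sgplot data=resid;
  scatter x=pred y=res;
  refline 0 / axis=y;
run;
```

Replacing logeff with efficiency in the model statement gives the corresponding fit and residual plot for the untransformed data.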

3. The Midwestern Chess Federation (MCF) conducts a survey to compare the MCF chess ratings of individuals in chess teams, each composed of five players, that are distributed across five midwestern states. The MCF ratings of individuals are determined by how well the players do in local chess tournaments and chess matches, including MCF team matches. The federation wants to know how the variation of ratings is distributed, specifically whether most of the variation is within teams (so that most teams would be equally matched), between teams within states, or between states. The survey data is in Table 3.

 Table 3 --- MCF ratings of individuals in 24 chess teams
 State1
    Team1  97  87  96  115  92    Team2  97  100  97  103  110
    Team3  88  84  105  98  81    Team4  105  92  107  101  92
    Team5  110  91  100  97  100
 State2
    Team1  98  95  107  93  102    Team2  100  93  110  106  96
    Team3  110  108  107  103  103
 State3
    Team1  99  92  100  102  101    Team2  104  110  99  94  90
    Team3  96  96  104  101  96    Team4  115  104  116  113  109
    Team5  97  108  91  96  98    Team6  104  97  96  99  101
    Team7  102  99  97  107  92    Team8  103  83  91  87  100
    Team9  101  89  98  102  94
 State4
    Team1  96  92  88  89  107    Team2  101  96  81  102  103
    Team3  103  108  110  98  93
 State5
    Team1  110  100  92  111  102    Team2  95  87  98  100  98
    Team3  102  99  98  107  111    Team4  101  98  89  99  97

Note that "Team1" does not refer to the same team in different states, but only to the first team listed for that state.
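Because team labels have meaning only within a state, teams are nested within states. One common way to write such a layout is given here; the notation is an assumption introduced for illustration and is not taken from the course notes, with i indexing states, j indexing teams within state i, and k indexing the five players on a team.

```latex
y_{ijk} = \mu + \alpha_i + \beta_{j(i)} + \varepsilon_{ijk},
\qquad i = 1,\dots,5,\quad j = 1,\dots,n_i,\quad k = 1,\dots,5,
\qquad (n_1,\dots,n_5) = (5,3,9,3,4).
```

The team counts n_i are read off Table 3 and sum to 24, so the design is unbalanced in the number of teams per state.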

(i) Using within-team variation to estimate the error, was there significant variation in the MCF ratings among the 24 chess teams in the study, ignoring the states that contain them? What is the P-value?

(ii) Analyze the appropriate ANOVA model taking into account both teams and states. Is there significant variation in the scores by state? Is there significant variation of MCF ratings by teams within states? What are the P-values in each case? What are the degrees of freedom of the two F statistics involved? (Hint: See NestedBatches.sas on the Math439 Web site. That example had data that was balanced in the sense that each level of the outer factor had the same number of levels of the nested factor, but that is not necessary for the analysis.)

(iii) What are the MSS (Mean Sum of Squares) values for within-team variation, between-state variation, and variation between teams within states? Are these consistent with your answers to part (ii)? How are the F-statistics in part (ii) computed in terms of these MSS values?

(iv) Is there significant variation in the scores by state, ignoring any team structure within each state? (That is, as if all the players from a state were on the same team.) What is the P-value? Why is this P-value different from the P-value for state in part (ii)?
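The nested analysis in part (ii) might be set up roughly as follows. This is a sketch under assumptions, not the course's method: the data set and variable names (chess, state, team, player, rating) are hypothetical, states and teams are coded by number, and testing state against team(state) is shown as one common choice for a nested design. Compare it with NestedBatches.sas before relying on it.

```sas
/* Read Table 3: one data line per team, five ratings per line */
data chess;
  input state team @;            /* team numbers restart within each state */
  do player = 1 to 5;
    input rating @;
    output;                      /* one observation per player */
  end;
  datalines;
1 1  97  87  96 115  92
1 2  97 100  97 103 110
1 3  88  84 105  98  81
1 4 105  92 107 101  92
1 5 110  91 100  97 100
2 1  98  95 107  93 102
2 2 100  93 110 106  96
2 3 110 108 107 103 103
3 1  99  92 100 102 101
3 2 104 110  99  94  90
3 3  96  96 104 101  96
3 4 115 104 116 113 109
3 5  97 108  91  96  98
3 6 104  97  96  99 101
3 7 102  99  97 107  92
3 8 103  83  91  87 100
3 9 101  89  98 102  94
4 1  96  92  88  89 107
4 2 101  96  81 102 103
4 3 103 108 110  98  93
5 1 110 100  92 111 102
5 2  95  87  98 100  98
5 3 102  99  98 107 111
5 4 101  98  89  99  97
;
run;

/* State effect plus teams nested within states */
proc glm data=chess;
  class state team;
  model rating = state team(state);
  test h=state e=team(state);    /* F test for state with team(state) as the error term */
run;
quit;
```

The default ANOVA table from the model statement tests every effect against within-team error, whereas the test statement requests a separate F test for state with a different denominator, so the two P-values for state need not agree.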
