Math 475 Takehome Final - Fall 2005

TAKEHOME FINAL due on or before Wed 12-21 by 3 P.M.
(Return to Prof. Sawyer or to math receptionist in Cupples I Room 100.)

NOTE: There should be NO COLLABORATION on the takehome final,
other than for the mechanics of using the computer.

Open textbook and notes (including course handouts).
In general where the results of a statistical test are asked for,
(i) EXPLAIN CLEARLY what the null hypothesis H₀ is and what alternative you are testing against,
(ii) find the P-value for the test indicated (and state what test you used), and
(iii) state whether the results are significant (P<0.05), highly significant (P<0.01), or not significant (P >= 0.05). If the P-value is based on a Student's t or Chi-square or F distribution, also give the degrees of freedom. (WARNING: An F distribution has TWO degrees of freedom, one for the numerator and one for the denominator.)

ORGANIZE YOUR WORK in the following manner:

(i) your answers to all questions,

(ii) all your SAS programs, and

(iii) all your SAS output.
ADD CONSECUTIVE PAGE NUMBERS to part (iii) of your homework so that you can make references from part (i) to part (iii). For example, so that you can say things like, "The answer in part (a) is 57.75. The scatterplot for part (b) is on page #Y below." It may be clearest to write page numbers yourself on the SAS output.

Different parts of problems may not be equally weighted.
5 problems.

Problem 1. Heights and weights for the employees of Vaporware Computer Services are recorded in Table 1. Each table entry has the height, weight, and sex for one employee, in that order. The employees of this company are known to be unusual.

     Table 1 --- Heights and Weights for Employees

     68  154  M       62   89  F       62   86  F       59  117  F
     71  125  M       63   81  F       70  137  M       60   88  F
     61   89  F       62  121  F       58  134  F       60   96  F
     65   90  F       67   85  F       67   96  F       78  122  M
     64   86  F       71  164  M       60   87  F       64  117  F
     72  162  M       65   84  F       73  129  M       70  143  M
     71  149  M       63  114  F       59  110  F       67  125  M
     64   80  F       74  156  M       59  108  F       64  120  F
     68  162  M       60  136  F       69  155  M       77  120  M
     60  109  F       72  134  M       56  140  F       75  132  M
     73  117  M       68   75  F       62  102  F       60  104  F
     67  134  M       60  107  F       76  137  M       73  124  M
     59   85  F       63   83  F       60  120  F       63   85  F
     64  127  F       73  133  M       69  119  M       72  120  M
     63   92  F       57  141  F       77  138  M       61  120  F
     68  134  M       59  102  F       58  121  F       77  110  M
     76  127  M       56  135  F       69  154  M       56  116  F
     59   89  F       70  164  M       73  129  M       76  117  M
     67   80  F       68  131  M

(i) For the employees grouped into two samples by sex, what are the two sample sizes? What are the two sample means for height?

Is there a significant difference in height between the two sexes? Use SAS to find out. What is the value of the t-statistic for the classical two-sample t-test? What is the P-value? What is the number of degrees of freedom of the t statistic?

The classical t-test assumes that the variances of the two samples are the same. Is this a reasonable assumption here? Why? What is a P-value for a hypothesis based on this assumption? Does this P-value mean that it is safe to assume that the variances are the same, or the opposite?

(ii) What is the Pearson correlation coefficient between height and weight for the individuals in Table 1, ignoring sex? Is it significantly different from zero? What is the P-value? Is this a one-sided or a two-sided P-value, in the sense that it tests for rho being either less than zero or greater than zero?

How was the P-value for the Pearson correlation coefficient calculated? What is the number of degrees of freedom of the test statistic that is used to calculate the P-value? What is the formula that expresses this test statistic in terms of rho?

(iii) What are the Pearson correlation coefficients between height and weight for employees within each sex? Are they significant? Do they have the same sign as the correlation coefficient in part (ii)? How can the correlation coefficients have one sign within groups but a different sign for the two groups combined? Construct a height by weight scatter plot using sex as the plotting symbol to illustrate your answer.

Problem 2. A marketing company carries out a survey to test whether consumer evaluations of a product depend on gender. The results of a large-scale study in three cities are shown in Table 2.

  Table 2 - Consumer Impressions in Three Cities

  City:          City1           City2           City3
  Opinion:       Y    N          Y    N          Y    N
  Female       155  298        328  149        373  424
  Male          71  162        599  328        125  185

(i) Based on this data, for all three locations together, is the product viewed differently by females than by males? (Carry out an appropriate two-sided test, which means that it should test for either more favorably or less favorably.) WARNING: Since row and column proportions of the three tables are NOT the same, combining the 2 by 2 tables into a single 2 by 2 table might lead to Simpson's Paradox.

What test did you use? Is the P-value one-sided or two-sided? (That is, is it also sensitive to the possibility that females might view the product less favorably overall?)

(ii) Are females more likely or less likely to like the product within each of the cities? What are the phi coefficients for the three contingency tables? Does phi>0 mean that females view the product more favorably than males, or less favorably?

(iii) Now aggregate the data (possibly incorrectly) into a single 2 by 2 table. How do your conclusions differ? What is the P-value for this (possibly incorrect) 2 by 2 table? What is the new phi coefficient? Is the sign of the new phi coefficient the same as at the individual locations? Are the women in the aggregated table relatively more favorable to the product in comparison with men than in the individual locations, relatively less favorable, or the same?

Problem 3. An engineer is interested in the resonant frequency of a mechanical device as a function of three variables: Pressure, with three levels (Press1,Press2,Press3), Drubness, with two levels (Drub1,Drub2), and Abrasiveness, with three levels (Abr1,Abr2,Abr3). The resonant frequencies of two devices are measured for each set of levels of the three variables. The resulting frequencies are listed in Table 3.

 Table 3. Frequencies of a Device

              Press1                Press2                Press3
          Drub1    Drub2       Drub1     Drub2         Drub1     Drub2
Abr1   3839 3202  326 117    5950 1254  357 1550     484  227  1915 2924
Abr2   1313 3202  276 368    1574 8814  530  538    1046 1128  1373 2795
Abr3   2097 6417  374 429    3614 1293  238 2476     201  886  1803 1647

(i) Use SAS to run a full factorial model with the three variables as three factors. Is the Model Test significant? What is its P-value?

(ii) Plot the residuals against the predicted values in the model. In order to get a better idea of the distribution of the residuals, include the level of Pressure in the residual plot as the plotting symbol. (Make sure that the plotting symbol identifies the Pressure level!)

Do the residuals appear to be independent of the predicted value and of the value of Pressure? Why? Do the residuals appear to be normally distributed? Carry out a test that provides a P-value for the normality of the residuals.

(iii) Run the full factorial model again with the values in Table 3 replaced by their logarithms. Is the Model test now more significant? Analyze the residuals of the log-transformed data in the same way as in part (ii). Do they now look more independent of the predicted value and of the value of Pressure?

(iv) Which of the main effects of Abrasiveness, Drubness, and Pressure are significant for the log-transformed data? Which are highly significant? Which of the four interactions are significant? Which are highly significant? What are the P-values for the significant effects? For the effects that are significant, what are the degrees of freedom for the F-tests involved, numerator and denominator?

(v) How were the F-statistics calculated for the tests in part (iv) that were significant? What was the denominator? Is the denominator of the F-statistics in the output?

(vi) For each of the two-way interactions that are significant, display an interaction plot. For each such interaction, is the interaction visible in the interaction plot? What can you conclude about the interaction and how it affects the dependent variable (that is, the resonant frequency of the device)?

Problem 4. An international baseball organization conducts a survey to compare the throwing expertise of catchers in a sample of Little League teams distributed among 4 Leagues. Proficiency scores for making an accurate throw from home to second base were made for 3 catchers on each team. The international organization wants to know where most of the variation of catcher throwing skills is located: among teams within leagues, between leagues, or a combination of both. The survey data is in Table 4.

 Table 4 --- Catcher throwing proficiencies by Team and League
 League1
    Team1  71  68  75    Team2  52  57  63    Team3  74  67  78
    Team4  76  91  71
 League2
    Team1  56  54  57    Team2  70  66  64    Team3  71  62  62
 League3
    Team1  70  50  64    Team2  59  61  74    Team3  53  65  57
    Team4  62  59  72    Team5  69  80  65    Team6  56  76  74
    Team7  64  62  49    Team8  61  73  48    Team9  47  57  51
 League4
    Team1  74  78  62    Team2  78  76  73    Team3  64  54  50
    Team4  70  68  66    Team5  65  72  73

Note that "Team1" does not refer to the same team in different leagues, which might be in different parts of the world, but only to the first team in that league that happened to send its catcher scores in to the international organization. Treat the three observations for each team as an independent sample for that team.

(i) Using within-team variation to estimate the error, was there significant variation in the proficiency scores over the 15 or more teams in the study, ignoring the leagues that contain them? What is the P-value? What are the degrees of freedom of the resulting F statistic?

(ii) Analyze the appropriate ANOVA model taking into account both teams and leagues. Is there significant variation in the scores by league? Is there significant variation by teams within leagues? What are the P-values in each case? What are the degrees of freedom of the two F statistics involved?

(iii) For the analysis in part (ii), which pairs of leagues differ significantly in terms of catcher scores? Run the appropriate Duncan procedure to find out.

(iv) What are the MSS (Mean Sum of Squares) values for within-team variation, between-league variation, and variation between teams within leagues? Are these consistent with your answers to part (ii)? How are the F-statistics in part (ii) computed in terms of these MSS values?

(v) Is there significant variation in the scores by league, ignoring any team structure within each league? (That is, assume that everybody in a league is on the same team, including perhaps dozens of catchers.) What is the P-value? What are the degrees of freedom of the F statistic? Why is this P-value different from the P-value for league in part (ii)?

Problem 5. After reading the lab notebook and checking with his technicians, the sound engineer in Problem 3 becomes concerned that the second replications in Table 3 may not be reliable. The second replication was done the week after the first replication under less stable conditions. (The first and second replications are the first and second values in each cell in Table 3, respectively, where cells are the possible combinations of Pressure, Drubness, and Abrasiveness.) The engineer wants to discard all of the second replications in Table 3 and only analyze the first replication in each cell. Unfortunately, this leads to data with three factors (Pressure, Drubness, and Abrasiveness) and only one replication per cell.

A friend of the engineer in the forestry service tells the engineer about split-plot analyses. The friend says that the assumptions appear to be satisfied here. The model can be considered to have a major factor (Drubness) and two minor factors (Pressure and Abrasiveness). For engineering reasons, the two minor factors should not be expected to have strongly nonadditive effects, so that the Pressure*Abrasiveness interaction and the three-way interaction are plausible candidates to estimate error. (Note that the sum of these two effects is Pressure*Abrasiveness nested within Drubness.)

(i) Carry out this split-plot analysis for the log-transformed data corresponding to the first replication. Answer the same questions in parts (iii)-(vi) of Problem 3 for the smaller data set and for the effects that can be tested in the split-plot analysis.

Among other things, find the residual plot and test normality, find all effects that are significant, identify the denominator of the F-statistics involved, and display interaction plots for significant interactions. (Warning: Don't forget to use log-transformed data!)

(Hint: If you constructed the SAS data set for Problem 3 with a variable Rep=1 for the first replication and Rep=2 for the second replication in each cell, then you can subset the data set using the command if Rep=1;.)
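The subsetting step in the hint might look like the following sketch. The data set names (freq, freq1) and the variable names (Abr, Press, Drub, Freq, LogFreq) are assumptions made here for illustration, only the Abr1/Press1 cells of Table 3 are typed in, and the natural logarithm is assumed for the log transform.

```sas
/* Hypothetical data set and variable names; only part of Table 3 is entered. */
data freq;
   input Abr $ Press $ Drub $ Rep Freq;
   LogFreq = log(Freq);   /* natural logarithm (an assumption) */
   datalines;
Abr1 Press1 Drub1 1 3839
Abr1 Press1 Drub1 2 3202
Abr1 Press1 Drub2 1 326
Abr1 Press1 Drub2 2 117
;
run;

/* Subsetting IF: keep only the first replication in each cell. */
data freq1;
   set freq;
   if Rep=1;
run;
```

With the full table entered, freq1 would contain one observation for each of the 18 cells.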

(ii) The residual plot for the subsetted data contains what may be a suspicious value. This value (which has Pressure=Press3) can be identified as the observation with the largest positive residual. Find the Studentized residual and the CookD statistic for this observation. Is this apparent outlier troublesome, on the basis of a Studentized residual larger than 3.0 or a CookD statistic larger than 1.0?

(Note: A more standard rule for the CookD statistic is whether the CookD value is greater than the median of the F-distribution with parameters F(r,n-r), where n is the number of observations and r is the number of fitted parameters. Using 1.0 instead is a rough approximation.)
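As a rough illustration of the rule in the note, the median of an F(r,n-r) distribution can be obtained from the SAS FINV quantile function. In the sketch below, n=18 is the number of cells with one replication each, while r=10 is only a placeholder and not the parameter count of any particular model; substitute the value for the model you actually fit.

```sas
/* r is a placeholder value chosen for illustration only. */
data _null_;
   n = 18;                         /* observations: 3 x 2 x 3 cells, one replication */
   r = 10;                         /* number of fitted parameters (assumed) */
   cutoff = finv(0.5, r, n - r);   /* median of the F(r, n-r) distribution */
   put cutoff=;                    /* writes the cutoff to the SAS log */
run;
```

The CookD value for the suspicious observation can then be compared with this cutoff instead of with 1.0.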
