Homework #11, Math 320, Spring 2001

Name:____________________________      Section:____

## Math 320 Homework #11 --- Due 4/13

Include your name, section number, and homework number on every page that you hand in. Enter ``Section 1'' for the morning class (10-11AM) and ``Section 2'' for Professor Sawyer's class (12-1PM).

Begin the exposition of your work on this page. If more room is needed, continue on sheets of paper of exactly the same size (8.5 x 11 inches), lined or not as you wish, but not torn from a spiral notebook. You should do your initial work and calculations on a separate sheet of paper before you write up the results to hand in.

Output from Excel must have your name and the homework number in cell A1.

1. (Like problem 10.4 on page 433.) A research group is interested in the development of alcoholism in children who were adopted at birth and how this relates to alcoholism in their biological (not adoptive) parents. Among a group of 55 men who had at least one alcoholic biological parent, ten (10) were judged to be presently alcoholic. These were compared with a group of 78 men whose biological parents were not alcoholic. In the second group, four (4) were judged to be presently alcoholic.

(i) Construct a 2 x 2 contingency table for this study.
(ii) Test the hypothesis that there is a difference in the rates of alcoholism using Pearson's chi-square test. What is the hypothesis H_0? Do you reject at alpha=0.05, using a two-sided test?
(iii) What is the P-value?
(iv) What is the phi coefficient? What does the sign of phi indicate in terms of the relative frequency of alcoholism in the two groups?
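
As a way to check hand or Excel work for a problem of this type, the calculation can be sketched in Python. This is only a sketch under stated assumptions: the function name `chi_square_2x2` is made up for illustration, the table is laid out with rows for the two groups of men and columns for alcoholic / not alcoholic, no continuity correction is applied, and the P-value uses the fact that a chi-square variable with 1 degree of freedom is the square of a standard normal.

```python
import math

def chi_square_2x2(table):
    """Pearson chi-square, two-sided P-value (1 df), and phi for a 2x2 table.

    Hypothetical helper for illustration; no continuity correction is applied.
    """
    (a, b), (c, d) = table
    n = a + b + c + d
    row = [a + b, c + d]          # row totals
    col = [a + c, b + d]          # column totals
    chi2 = 0.0
    for i, cells in enumerate(table):
        for j, obs in enumerate(cells):
            exp = row[i] * col[j] / n        # expected count under H_0
            chi2 += (obs - exp) ** 2 / exp   # cell contribution
    # For 1 df, P(chi-square > x) = P(|Z| > sqrt(x)) = erfc(sqrt(x / 2)).
    p_value = math.erfc(math.sqrt(chi2 / 2))
    # phi keeps the sign of ad - bc, so it shows which group has the higher rate.
    phi = (a * d - b * c) / math.sqrt(row[0] * row[1] * col[0] * col[1])
    return chi2, p_value, phi

# Assumed layout: rows = (alcoholic parent, no alcoholic parent),
# columns = (presently alcoholic, not alcoholic).
table = [[10, 45], [4, 74]]
chi2, p_value, phi = chi_square_2x2(table)
print(f"chi-square = {chi2:.3f}, P = {p_value:.4f}, phi = {phi:.3f}")
```

With this layout, a positive phi means the first row has the higher proportion in the first column. Swapping the two rows or the two columns flips the sign of phi but leaves the chi-square statistic unchanged.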

2. Do problem 10.14 on page 447. Note that the output on page 447 has information that could be put in three different 3x4 tables:

(a) the observed data itself,
(b) the ``expected values'':  that is, the cell counts that would be expected if row frequency proportions were identically the same for all columns, or equivalently if column frequency ratios were identically the same in all rows, and
(c) the 12 contributions to the Pearson chi-square statistic.

3. A political pollster conducts a poll to test voter opinions about his candidate among 6 different groups of voters. The results were

```
Group:          A     B     C     D     E     F  |   SUM
--------------------------------------------------------
Approve        44    21    91    81    37    28  |   302
Disapprove     48    24   119   152    53    27  |   423
No opinion     29    28    57    47    29    21  |   211
--------------------------------------------------------
SUM:          121    73   267   280   119    76  |   936
```
(i) Use Excel to test whether these ratings differ among the six groups of voters, and to identify which cells (if any) may be responsible for any significance.
That is, use Excel to (ia,b) construct two other 3x6 tables as in items (b) and (c) in the last problem, one with ``expected values'' and one with the values (Obs-Exp)^2/Exp for each cell, and (ic) find the Pearson chi-square statistic for the table as the sum of the entries in the second 3x6 table.
Do you accept or reject H0 at alpha=0.01? What is the P-value? How many degrees of freedom did you use? (The easiest way to find the P-value is within Excel, but you could also use a TI-83.)
(Hint: See the sample `Contingency Tables` spreadsheet on the `Example Spreadsheets` page on the Math320 Web site.)
(ii) Referring to the 3x6 tables in your spreadsheet, which cell or cells appear to be the most out of balance? Recall that the entries (Obs-Exp)^2/Exp should be approximately the square of a standard normal given H0, so that any terms that are greater than 4 may contribute to the significance of the entire table.
(iii) What do you think that the candidate should try to do with respect to the corresponding group or groups of voters?
(Remark: Instead of building a table with values X = (Obs-Exp)^2/Exp for each cell and summing them using the Excel Sum function, you could also build a table with the Z-scores Z = (Obs-Exp)/Root(Exp) and combine them using the Excel Sumsq function. The Z-scores Z will be approximately standard normal given H0, and the sign of Z tells whether the observed value is higher or lower than expected.)
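
The spreadsheet steps in this problem (expected values, Z-scores, squared contributions, and their sum) can also be sketched in Python as a cross-check. This does not replace the required Excel output. The helper names `expected_counts`, `z_scores`, and `chi2_sf_even_df` are made up for illustration, and the P-value helper relies on a closed form for the chi-square upper tail that holds only when the number of degrees of freedom is even.

```python
import math

# Observed counts: rows = Approve, Disapprove, No opinion; columns = groups A..F.
observed = [
    [44, 21, 91, 81, 37, 28],
    [48, 24, 119, 152, 53, 27],
    [29, 28, 57, 47, 29, 21],
]

def expected_counts(obs):
    """Expected cell counts under H0: (row total) * (column total) / (grand total)."""
    row = [sum(r) for r in obs]
    col = [sum(c) for c in zip(*obs)]
    n = sum(row)
    return [[ri * cj / n for cj in col] for ri in row]

def z_scores(obs, exp):
    """Signed scores Z = (Obs - Exp) / sqrt(Exp) for each cell."""
    return [[(o - e) / math.sqrt(e) for o, e in zip(ro, re)]
            for ro, re in zip(obs, exp)]

def chi2_sf_even_df(x, df):
    """Upper-tail chi-square probability; this closed form is valid for even df only."""
    if df <= 0 or df % 2 != 0:
        raise ValueError("closed form requires a positive even df")
    half = x / 2
    term, total = 1.0, 1.0
    for i in range(1, df // 2):
        term *= half / i          # (x/2)^i / i!
        total += term
    return math.exp(-half) * total

expected = expected_counts(observed)
z = z_scores(observed, expected)
contrib = [[v * v for v in r] for r in z]               # (Obs-Exp)^2/Exp per cell
chi2 = sum(sum(r) for r in contrib)                      # Pearson chi-square statistic
df = (len(observed) - 1) * (len(observed[0]) - 1)        # (rows-1)*(columns-1)
p_value = chi2_sf_even_df(chi2, df)

print(f"chi-square = {chi2:.3f} on {df} df, P = {p_value:.5f}")
for label, r in zip(["Approve", "Disapprove", "No opinion"], z):
    print(label.ljust(11), " ".join(f"{v:6.2f}" for v in r))
```

Printing the signed Z-scores rather than their squares shows at a glance both the size and the direction of each cell's departure from its expected value.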

4. Do exercise 10.24 on page 458.

5. The heights and weights of 10 students are

```
Student#:    1    2    3    4    5    6    7    8    9   10
------------------------------------------------------------
Height:     75   57   63   70   59   69   70   74   69   67
Weight:    134  100  134  158  104  119  116  138  118  129
------------------------------------------------------------
```
(i) Find the sample correlation coefficient between height and weight for the 10 students. Show how you found this number.
(ii) Are height and weight significantly correlated in this data? Assume that height and weight are both normally distributed and test the hypothesis H0 that height and weight are independent. How many degrees of freedom did you use in your test?
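
For checking the arithmetic, the sample correlation and the usual test statistic for H0 can be sketched in Python. The sketch assumes the standard result that, for bivariate normal data with zero population correlation, t = r*sqrt(n-2)/sqrt(1-r^2) has a Student t distribution with n-2 degrees of freedom. The function names are made up for illustration, and the critical value still has to be looked up in a t table.

```python
import math

height = [75, 57, 63, 70, 59, 69, 70, 74, 69, 67]
weight = [134, 100, 134, 158, 104, 119, 116, 138, 118, 129]

def sample_correlation(x, y):
    """Pearson sample correlation computed from sums of squared deviations."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    return sxy / math.sqrt(sxx * syy)

def correlation_t(r, n):
    """t statistic and degrees of freedom for testing zero correlation."""
    df = n - 2
    return r * math.sqrt(df / (1 - r * r)), df

r = sample_correlation(height, weight)
t, df = correlation_t(r, len(height))
print(f"r = {r:.4f}, t = {t:.3f} on {df} df")
```

The value of t is then compared with the two-sided critical value from a t table for the printed number of degrees of freedom.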

6. The heights and weights of a group of 73 students were found to satisfy

```
Sum(X)  =   5021       Sum(X^2) = 347873
Sum(Y)  =   9464       Sum(Y^2) =1256309
Sum(XY) = 655733
```
where Xi and Yi refer to the height and weight of the ith student, respectively.
(i) Find the sample correlation coefficient between height and weight for the 73 students.
(ii) Are height and weight significantly correlated in this data? Assume that height and weight are both normally distributed and test the hypothesis H0 that height and weight are independent. How many degrees of freedom did you use in your test?
(iii) Use the Fisher z-transformation to find a 95% confidence interval for the population Pearson correlation coefficient r (see text pages 477-478). (Hint: Use a calculator and the transformations w=(1/2)ln((1+r)/(1-r)) and r=(exp(2w)-1)/(exp(2w)+1) rather than the table in Figure 11.7.)
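
The two transformations in the hint, together with the computation of r from summary sums, can be sketched in Python as a check on calculator work. The sketch assumes the usual large-sample result that the transformed value w is approximately normal with standard deviation 1/sqrt(n-3), and it uses 1.96 as the two-sided 95% normal critical value. The function names are made up for illustration.

```python
import math

def correlation_from_sums(n, sx, sy, sxx, syy, sxy):
    """Sample correlation from n, Sum(X), Sum(Y), Sum(X^2), Sum(Y^2), Sum(XY)."""
    cxx = sxx - sx * sx / n       # corrected sum of squares for X
    cyy = syy - sy * sy / n       # corrected sum of squares for Y
    cxy = sxy - sx * sy / n       # corrected sum of cross products
    return cxy / math.sqrt(cxx * cyy)

def fisher_interval(r, n, z_crit=1.96):
    """Approximate confidence interval for the population correlation."""
    w = 0.5 * math.log((1 + r) / (1 - r))        # forward transformation
    half_width = z_crit / math.sqrt(n - 3)       # assumed SD of w is 1/sqrt(n-3)

    def back(v):
        return (math.exp(2 * v) - 1) / (math.exp(2 * v) + 1)   # inverse transformation

    return back(w - half_width), back(w + half_width)

n = 73
r = correlation_from_sums(n, 5021, 9464, 347873, 1256309, 655733)
low, high = fisher_interval(r, n)
print(f"r = {r:.4f}, 95% interval = ({low:.3f}, {high:.3f})")
```

The interval is not symmetric about r, because the back-transformation compresses values of w more strongly the farther they are from zero.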