Math 408 Homework 1 - Spring 2007


    HOMEWORK #1 due Tuesday Feb 6

    Text references are to Hollander and Wolfe,
      "Nonparametric Statistical Methods", 2nd ed.

    NOTE: In the following, ^ means superscript, _ (underscore) means subscript, and Sum(i=1,9) means the sum for i=1 to 9.

    IN THE FOLLOWING:   Do Problems 1-4 by hand. Problem 5 asks you to write a computer program.

    1.  Consider the data in Table 3.7, p71 of the text about clotting times before and after the administration of 600mg of aspirin.

    (i) Use a Student t-test to test the hypothesis that there is no difference between the before and after clotting times. Is the resulting P-value significant (P<0.05)? Highly significant (P<0.01)?
    (ii) Use the sign test to test the same hypothesis. Obtain two-sided P-values using (a) the exact distribution of the binomial in Table A2 and (b) the normal approximation. Is the resulting P-value significant (P<0.05)? highly significant (P<0.01)? Is it the same as in part (i)?
    (iii) Use the Wilcoxon signed rank test to test the same hypothesis. Obtain two-sided P-values using (a) the exact distribution of the signed rank statistic in Table A4 and (b) the normal approximation. Is the resulting P-value significant (P<0.05)? highly significant (P<0.01)? Is it the same as in parts (i) and (ii)?
    (iv) Which of the three tests would you consider most reasonable? Why?

    2.  Consider the data in Table 3.11, p83 of the text about levels of 6-beta-hydrocortisol excreted by chemical company workers.

    (i) Use the sign test to test the hypothesis that the median amount per day can be distinguished from 175 micrograms. Obtain two-sided P-values using (a) the exact distribution of the binomial in Table A2 and (b) the normal approximation. (Hint: Subtract 175 from each of the observations and see if the differences can be distinguished from zero.)
    (ii) Find the nonparametric estimate of the median amount per day using the Hodges-Lehmann sign-test estimator described in Section 3.5.
    (iii) Find a (1-alpha) times 100% confidence interval for the median amount per day, using the sign-test-associated confidence interval described in Section 3.6, where alpha is chosen so that 1-alpha is as close as possible to 0.95. What is the size (as in 95%) of the resulting confidence interval?
    (iv) A company executive states that while about as many values in the original data are larger than 175 micrograms/day as are smaller, the differences from 175 micrograms/day seem to be larger for the positive values. As an alternative, find the nonparametric Hodges-Lehmann estimator of the median based on the Wilcoxon signed rank test described in Section 3.2. Is the resulting estimate larger than in part (ii)?
    (v) Also find the associated (1-alpha) times 100% confidence interval for the median amount per day, using the Wilcoxon signed-rank-associated confidence interval described in Section 3.3, where alpha is chosen so that 1-alpha is as close as possible to 0.95. What is the size (as in 95%) of the resulting confidence interval?

    3.  It is conjectured that tropical plants of a certain genus tend to produce more flowers at higher altitudes than at lower altitudes. Fifteen species in this genus are known to occur at both altitudes in a particular country. To test the conjecture, one plant from a lowland forest and one plant from higher altitudes were collected from each of twelve of these species. The number of flowers on each plant was counted, and the results were:

         Species LowAlt HighAlt       Species LowAlt HighAlt
           1        4     10            7        4     14       
           2       11      3            8        7      4       
           3        7     10            9       15      3       
           4       17     17           10        7      7       
           5        5     19           11        3     17       
           6        4     12           12        7     10       
     
    (i) Use the Wilcoxon signed rank test to test whether or not plants from higher altitude tend to have more (or fewer) flowers than plants from lower altitudes. What is the value of the Wilcoxon statistic T^+? What is the associated (two-sided) P-value? Use both (a) the tables and (b) the normal approximation. (Be sure to handle ties correctly. Recall that ties between nonzero absolute values are ignored when using the table.)
    (ii) Even though the data is from 24 different plants, why would it be incorrect to assume that the plants from the lowlands and the plants from higher altitude form two independent samples?

    4.  Change the value of X_3 in Table 3.1 on p39 of the text (Hamilton Depression Scale Factor values) from 1.62 to 16.2.

    What effect does this have on the value of Zbar=(1/9)Sum(i=1,9) Z_i for Z_i=Y_i-X_i? (That is, compare the values of Zbar before and after the change.) What effect does this have on the value of the Hodges-Lehmann estimator thetahat based on the Wilcoxon signed-rank statistic? (See Example 3.3 on page 52.) Which estimator seems to be more strongly affected by outliers?

    5.  Write a short program in C (or a C-like computer language) based on the salary data in Table 3.2 (p41) of the text that

    (i) Includes the salaries in dollars for the 12 private-sector individuals and for the 12 government-sector individuals as initial values of two global arrays, so that (for example) xval[0]=12500 and yval[0]=11750,
    (ii) Computes the 12 salary differences and stores them in a third global array zval[] whose values are initially zero.  (For example, as defined by  double zval[20]; before the main() function.)
    (iii) Computes and displays (a) the sample mean Zbar, (b) the sample standard deviation, ss, (c) the sample standard error of the mean, stderr, and (d) the one-sample t-statistic T=Zbar/stderr, for the 12 salary differences in zval[].
    (iv) If the salary differences Z_i were normally distributed with mean mu and variance sigma^2, then, given H_0:mu=0, the statistic T in part (iii) would have a Student-t distribution with 11 degrees of freedom. Assuming instead that T has a normal distribution with mean zero and variance one given H_0, find the two-sided P-value for H_0:mu=0 versus H_1:mu is not 0.
    (v) The text on page 41 reports a two-sided P-value of P=0.078 using the Wilcoxon signed-rank test. How does this compare with the value that you computed in part (iv)? (The answer to this part need not be part of your computer program.)
    (Hint: See sample C programs on the Math408 Web site.)
