(III) The output from the SAS programs in Part II.
In Part I, you can refer to plots or tables or large matrices that
problems ask for by saying (for example), ``The scatterplot or matrix for
Problem 3 is on page 17 of the SAS output.'' If necessary, add
page numbers to the SAS output, so that (for example) you don't have
several different page 1s in Part III.
1. Assume that X is vector-valued normal N(mu_X,B) for
         (  5 )             ( 0 0 0 )
  mu_X = ( -3 )   and   B = ( 0 2 3 )
         ( -2 )             ( 0 3 5 )
Consider the random vector Y = A X in R^2 for
  A = ( 1 2 3 )
      ( 0 1 2 )
Thus Y is normal N(mu_Y,C_Y) for some vector mu_Y and matrix C_Y. Find
mu_Y and C_Y.
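
The rule behind Problem 1 is that if X is N(mu_X,B) and Y = A X, then mu_Y = A mu_X and C_Y = A B A'. The following is a minimal Python sketch of that rule, not a solution to the problem: the matrices mu_demo, cov_demo and a_demo are made-up illustrative values (deliberately not the ones above), and all function names are hypothetical helpers.

```python
# Sketch: mean and covariance of Y = A X when X ~ N(mu, cov).
# All inputs below are illustrative values, not the ones in Problem 1.

def mat_vec(a, v):
    """Matrix-vector product a v."""
    return [sum(a_ij * v_j for a_ij, v_j in zip(row, v)) for row in a]

def mat_mul(a, b):
    """Matrix product a b."""
    cols = list(zip(*b))
    return [[sum(x * y for x, y in zip(row, col)) for col in cols] for row in a]

def transpose(a):
    """Matrix transpose a'."""
    return [list(col) for col in zip(*a)]

def linear_image_of_normal(a, mu, cov):
    """Return (A mu, A cov A') for Y = A X."""
    mean_y = mat_vec(a, mu)
    cov_y = mat_mul(mat_mul(a, cov), transpose(a))
    return mean_y, cov_y

mu_demo = [1, 2]
cov_demo = [[2, 1], [1, 3]]
a_demo = [[1, 1], [0, 2]]

mean_y, cov_y = linear_image_of_normal(a_demo, mu_demo, cov_demo)
print(mean_y)  # [3, 4]
print(cov_y)   # [[7, 8], [8, 12]]
```

Note that the resulting covariance matrix is symmetric, as it must be.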
2. Let X be an n by d matrix whose entries X_ia are independent
N(0,1) for 1 <= i <= n and 1 <= a <= d.
(i) Let Y be the random matrix X'X. Show that Y has a Wishart
distribution W_d(n,I_d), in the notation of equation (4.15) in the text.
(Hint: Write out the entries of Y = (Y_ij) or Y = (Y_ab) in terms
of the entries X_{ia} of X. Note that it is sufficient to set mu=0 in
(4.15) in the text.)
(ii) Show that E(Y)=n I_d.
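
A simulation is not a proof, but it can serve as a sanity check on part (ii). The sketch below, in Python with only the standard library, draws X repeatedly, forms Y = X'X entry by entry as Y_ab = sum over i of X_ia X_ib, and averages the draws. The values n=5, d=3, the number of replications, and the seed are arbitrary choices for illustration, and the function names are hypothetical.

```python
import random

def wishart_draw(n, d, rng):
    """Draw X (n by d, independent N(0,1) entries) and return Y = X'X."""
    x = [[rng.gauss(0.0, 1.0) for _ in range(d)] for _ in range(n)]
    # Y_ab = sum over i of X_ia * X_ib
    return [[sum(x[i][a] * x[i][b] for i in range(n)) for b in range(d)]
            for a in range(d)]

def average_wishart(n, d, reps, seed=0):
    """Average `reps` independent draws of Y to approximate E(Y)."""
    rng = random.Random(seed)
    total = [[0.0] * d for _ in range(d)]
    for _ in range(reps):
        y = wishart_draw(n, d, rng)
        for a in range(d):
            for b in range(d):
                total[a][b] += y[a][b]
    return [[t / reps for t in row] for row in total]

# Arbitrary illustrative sizes: n = 5 rows, d = 3 columns.
avg = average_wishart(n=5, d=3, reps=4000)
for row in avg:
    print([round(v, 2) for v in row])
```

The printed matrix should be close to 5 times the 3 by 3 identity, up to Monte Carlo error.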
3. Table 5.5 (page 150) in the text has four measurements on m=19
specimens of the flea-beetle species Haltica oleracea and on n=20
specimens of another flea-beetle species, H. carduorum. (See also the
data file FleaBeetles.dat.)
(i) Using the Hotelling T^2 test for all four measurements
y_1,y_2,y_3,y_4, use SAS to test the hypothesis H_0:E(X)=E(W), where
X_1,...,X_m (in R^4) represent the measurements from H. oleracea
and W_1,...,W_n the measurements from the second flea-beetle species. Do
you accept or reject H_0?
(ii) From the output, what is the value of the associated F statistic
for the multivariate test? What is the number of degrees of freedom, both
in the numerator and in the denominator? How were these derived from the
number of components in the observations (d=4) and the sample
sizes (m,n)?
(iii) Carry out two-sample t-tests on the four measurements
y_1,y_2,y_3,y_4 individually. Which of these are significantly different
between the two samples? What are the two-sided P-values?
(Hints: See MLizards.sas on the Math439 Web site. Do not
log-transform the data. If you use proc format to assign descriptive
tags to the Species variable (=1,2), make sure that you use the correct
species names.)
(Warning: Make sure that SAS reports that you have
measurements for 39 individuals.)
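
As a cross-check on what SAS reports, the two-sample Hotelling statistic can be computed by hand. The Python sketch below assumes the usual pooled-covariance form, T^2 = (mn/(m+n)) (Xbar-Wbar)' S^{-1} (Xbar-Wbar), with F = T^2 (m+n-d-1)/((m+n-2)d) on (d, m+n-d-1) degrees of freedom. The small data set is made up for illustration (it is not the flea-beetle data), and the function names are hypothetical.

```python
def mean_vector(rows):
    """Componentwise sample mean of a list of d-vectors."""
    n = len(rows)
    return [sum(col) / n for col in zip(*rows)]

def scatter_matrix(rows, mean):
    """Sum of (r - mean)(r - mean)' over the rows."""
    d = len(mean)
    return [[sum((r[a] - mean[a]) * (r[b] - mean[b]) for r in rows)
             for b in range(d)] for a in range(d)]

def solve(matrix, rhs):
    """Solve matrix * x = rhs by Gaussian elimination with partial pivoting."""
    d = len(rhs)
    aug = [list(matrix[i]) + [rhs[i]] for i in range(d)]
    for col in range(d):
        pivot = max(range(col, d), key=lambda r: abs(aug[r][col]))
        aug[col], aug[pivot] = aug[pivot], aug[col]
        for r in range(col + 1, d):
            factor = aug[r][col] / aug[col][col]
            for c in range(col, d + 1):
                aug[r][c] -= factor * aug[col][c]
    x = [0.0] * d
    for i in reversed(range(d)):
        tail = sum(aug[i][j] * x[j] for j in range(i + 1, d))
        x[i] = (aug[i][d] - tail) / aug[i][i]
    return x

def hotelling_two_sample(xs, ws):
    """Return (T^2, F, df1, df2) for H_0: E(X) = E(W), pooled covariance."""
    m, n, d = len(xs), len(ws), len(xs[0])
    xbar, wbar = mean_vector(xs), mean_vector(ws)
    sx, sw = scatter_matrix(xs, xbar), scatter_matrix(ws, wbar)
    pooled = [[(sx[a][b] + sw[a][b]) / (m + n - 2) for b in range(d)]
              for a in range(d)]
    diff = [x - w for x, w in zip(xbar, wbar)]
    quad = sum(di * si for di, si in zip(diff, solve(pooled, diff)))
    t2 = (m * n / (m + n)) * quad
    df1, df2 = d, m + n - d - 1
    f = t2 * df2 / ((m + n - 2) * d)
    return t2, f, df1, df2

# Made-up illustrative data with d = 2 (NOT the flea-beetle measurements).
xs_demo = [[1.0, 2.0], [2.0, 1.0], [3.0, 4.0], [4.0, 3.0]]
ws_demo = [[2.0, 4.0], [3.0, 6.0], [5.0, 5.0], [6.0, 8.0]]
print(hotelling_two_sample(xs_demo, ws_demo))
```

For d=1 the statistic reduces to the square of the pooled two-sample t statistic, which gives a quick way to check the function.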
4. Consider the blood glucose data in Table 3.8 (page 80) of the
text or MPairedSamp.sas on the Math439 Web site. Let y_1,y_2,y_3 be the
fasting blood glucose levels of each subject and x_1,x_2,x_3 the levels
after one hour, as in Table 3.8. Let z_i=x_i-y_i (i=1,2,3) be the
increases (as in MPairedSamp).
(i) Use SAS to test H_0:E(z_1)=E(z_2)=E(z_3); that is, that the
mean increase in blood glucose level was the same for all three visits.
Do you accept or reject H_0?
(Hint: Note that a=b=c is equivalent to b-a=0 and c-a=0. Thus,
you can define W1=Z2-Z1 and W2=Z3-Z1 and then test H_0:E(W)=0 for
W = (W_1,W_2)'. See also the discussion in Section 5.9.1 in
the text.)
(ii) From the output, what is the value of the associated F statistic
for the multivariate test? What is the number of degrees of freedom, both
in the numerator and in the denominator? How were these derived from the
number of components in the observations and the sample size?
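
The reduction in the hint to part (i) can be sketched in Python as follows: form W1=Z2-Z1 and W2=Z3-Z1 for each subject, then compute the one-sample Hotelling statistic T^2 = n Wbar' S^{-1} Wbar, assuming the usual conversion F = T^2 (n-p)/((n-1)p) on (p, n-p) degrees of freedom with p=2. The rows in z_demo are made-up values for illustration (not the data of Table 3.8), and the function names are hypothetical.

```python
def contrasts(z_rows):
    """Map each (z1, z2, z3) to (W1, W2) = (z2 - z1, z3 - z1)."""
    return [(z2 - z1, z3 - z1) for z1, z2, z3 in z_rows]

def one_sample_t2_2d(ws):
    """One-sample Hotelling T^2 for H_0: E(W) = 0, with W in R^2.

    Returns (T^2, F, (df1, df2)). Uses the explicit inverse of the
    2 by 2 sample covariance matrix S.
    """
    n, p = len(ws), 2
    m1 = sum(w[0] for w in ws) / n
    m2 = sum(w[1] for w in ws) / n
    s11 = sum((w[0] - m1) ** 2 for w in ws) / (n - 1)
    s22 = sum((w[1] - m2) ** 2 for w in ws) / (n - 1)
    s12 = sum((w[0] - m1) * (w[1] - m2) for w in ws) / (n - 1)
    det = s11 * s22 - s12 ** 2
    # Wbar' S^{-1} Wbar with S^{-1} = (1/det) [[s22, -s12], [-s12, s11]]
    quad = (s22 * m1 * m1 - 2 * s12 * m1 * m2 + s11 * m2 * m2) / det
    t2 = n * quad
    f = t2 * (n - p) / ((n - 1) * p)
    return t2, f, (p, n - p)

# Made-up illustrative increases (z1, z2, z3) for four subjects.
z_demo = [(0, 2, 2), (1, 2, 4), (5, 5, 7), (2, 3, 3)]
w_demo = contrasts(z_demo)
t2_demo, f_demo, df_demo = one_sample_t2_2d(w_demo)
print(w_demo)
print(t2_demo, f_demo, df_demo)
```

Taking differences against Z1 is one choice of contrasts; any other pair of independent contrasts, such as Z2-Z1 and Z3-Z2, gives the same T^2.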