Math 3200 Assignments

Math 3200 - Spring 2008
Recommended Homeworks, Required Computer HWs, and Exams

Click here for Required Computer Homework

Click here for Prof. Sawyer's home page

Problems from textbook:

Statistics and Data Analysis from Elementary to Intermediate

A. J. Tamhane and D. D. Dunlop, Prentice-Hall, 2000, ISBN 978-0137-44426-7

Recommended (By-Hand) Homework Assignments:
   (HW1 due Jan 23) Jan 14 - Chapter 2 - Section 2.1 - 6,9,10,11,12,14
   Jan 16 - Chapter 2 - Section 2.2 - 16,17,18,20,22,27
   Jan 18 - Chapter 2 - Section 2.3 - 28,29,30,32,33,34    HW1 Answers
   Jan 21 - Martin Luther King Day
   (HW2 due Jan 28) Jan 23 - Chapter 2 - Section 2.4 - 35,38,40,41,42,46
   Jan 25 - Chapter 2 - Section 2.5 - 48,49,50,52,53,54    HW2 Answers
   (HW3 due Feb 4) Jan 28 - Chapter 2 - Section 2.7 - 59,60,61,62,64,70
   Jan 30 - Chapter 2 - Section 2.8 - 71,72,73,74,75,76
   Feb 01 - Chapter 2 - Section 2.9 - 78,79,80,81,82,83    HW3 Answers
   (HW4 due Feb 11) Feb 04 - Chapter 3 - Section 3.1 - 1,2,3,4,5,6
   Feb 05 - FIRST EXAMINATION
   Feb 06 - Chapter 3 - Section 3.2 - 7,8,9,10,11
   Feb 08 - Chapter 3 - Section 3.3 - 12,14,15,16,17,18    HW4 Answers
   (HW5 due Feb 18) Feb 11 - Chapter 3 - Section 3.4 - 20,21,22,23,24,26
   Feb 13 - Chapter 4 - Sections 4.1,4.2 - 2,3,4,5,6,8
   Feb 15 - Chapter 4 - Section 4.3 - 9,10,11,12,14         HW5 Answers
   (HW6 due Feb 25) Feb 18 - Chapter 4 - Section 4.4 - 26,30,31,33,34,38
   Feb 20 - Chapter 5 - Section 5.1 - 1,4,6,7,8
   Feb 22 - Chapter 5 - Section 5.2 - 16,18,19,20,22,23    HW6 Answers    HW6a Answers
   (HW7 due Mar 3) Feb 25 - Chapter 5 - Sections 5.3,5.4 - 24,25,26,29,30,32
   Feb 27 - Chapter 6 - Section 6.1 - 1,2,3,4,7,8
   Feb 29 - Chapter 6 - Section 6.2 - 11,12,13,14,15,16    HW7 Answers
   (HW8 due Mar 17) Mar 03 - Chapter 6 - Section 6.3 - 17,18,20,22,24,30
   Mar 04 - SECOND EXAMINATION
   Mar 05 - Chapter 7 - Sections 7.1,7.2 - 1,7,8,12,13,16
   Mar 07 - Chapter 7 - Sections 7.3,7.4 - 17,18,19,20,21,22    HW8 Answers
   Mar 10 - Spring Break
   Mar 12 - Spring Break
   Mar 14 - Spring Break
   (HW9 due Mar 26) Mar 17 - Chapter 8 - Sections 8.1,8.2 - 1,2,3,6,7,8
   Mar 19 - Chapter 8 - Sections 8.3,8.4 - 9,10,13,16,18,20
   Mar 24 - Chapter 9 - Sections 9.1,9.2 - 5,6,8,11,14,16    HW9 Answers
   (HW10 due Apr 04) Mar 28 - Chapter 9 - Sections 9.3,9.4 - 17,20,22,27,28,32
   Apr 02 - Chapter 10 - Sections 10.1,10.2 - 2,4,5,6,7,8
   Apr 04 - Chapter 10 - Sections 10.3,10.4 - 9,10,15,16,20,24    HW10 Answers
   (HW11 due Apr 14) Apr 07 - Chapter 10 - Section 10.5 - 28,29,30,31,32,34
   Apr 08 - THIRD EXAMINATION
   Apr 09 - Chapter 11 - Sections 11.1,11.2,11.4 - 2,3,4,11,12,17
   Apr 11 - Chapter 11 - Sections 11.5,11.6 - 22,23,28,30,34,37    HW11 Answers
   (HW12 due Apr 21) Apr 14 - Chapter 11 - Section 11.7 - 40,41,42,44,45,46
   Apr 16 - Chapter 12 - Sections 12.1,12.2 - 1,2,5,7,10,12
   Apr 18 - Chapter 12 - Sections 12.3,12.4 - 18,19,20,21,22,28    HW12 Answers
   (HW13 due Apr 28) Apr 21 - Chapter 13 - Section 13.1 - 2,3,6,7,9,13
   Apr 23 - Chapter 14 - Sections 14.1,14.2 - 1,2,4,12,13,16
     (Sections 14.3-14.4 will not be covered on the Final)    HW13 Answers
   Apr 25 - Chapter 14 - Sections 14.3,14.4 - 19,20,21,23,24,25
   Apr 28 - Reading Period
   Apr 30 - Reading Period
   May 02 - Final Examination (10:30 AM - 12:30 PM)

Required Computer Homework Assignments:
All assignments are to be done using SAS.

  CHW1 due Feb 20    -    Computer HW1

  CHW2 due Feb 27    -    Problems 4.12, 4.26, 5.2 - (SAS hints for Computer HW2)

  CHW3 due Mar 03    -    Problems 5.22, 5.29, 6.7 - (SAS hints for Computer HW3)

  CHW4 due Mar 17    -    Computer HW4

  CHW5 due Mar 24    -    Computer HW5

  CHW6 due Mar 31    -    Computer HW6

  CHW7 due Apr 07    -    Computer HW7

  CHW8 due Apr 14    -    Computer HW8

  CHW9 due Apr 21    -    Computer HW9

Computer Homeworks:

NOTE: See How to Format Computer Homework on the main Math3200 Web page for how Computer Homeworks should be formatted.

Note: `le' means less than or equal to. `ge' means greater than or equal to.

Computer HW1 due Wednesday Feb 20 by 4:45 PM:

See Suggestions for HOW TO ORGANIZE your answers for Computer HW1 below.
See How to Format Computer Homework in general. (This is on the main Math3200 Web page.)

1. (a) Write a SAS program that generates 100 random variables X_i that are uniformly distributed between 0 and 1. Use SAS to find the mean and standard deviation of the sample of 100 r.v.s. How do they compare to the theoretical mean and standard deviation of 100 random variables with that distribution?
   Use the X_i to generate random integers Y_i with a discrete uniform distribution P(Y_i=k)=1/10 for k=1,2,...,10. Is the sample distribution of the 100 random integers close to 1/10 for each k with 1 le k le 10?
   (b) Do the same as in part (a) with 10,000 r.v.s instead of 100. Do the results appear to improve?
   (Hints: The function floor(x) in SAS (and many other computer languages) returns the greatest integer m le x. If X is U(0,1), then Y=1+floor(10*X) satisfies Prob(Y=k)=1/10 for k=1,2,...,10. Use proc means to keep track of X_i and proc freq to keep track of Y_i. See randlist.sas and SRSexamp.sas on the Math3200 Web site for examples of the use of proc means and proc freq.)
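
The hints for Problem 1 can be put together roughly as follows. This is only a minimal sketch: the dataset name unif, the loop variable i, and the seed given to streaminit are arbitrary choices made here for illustration and are not taken from randlist.sas or SRSexamp.sas.

```sas
/* Sketch only: dataset name, variable names, and seed are assumptions. */
data unif;
  call streaminit(3200);        /* arbitrary seed, so that runs are repeatable */
  do i = 1 to 100;              /* change 100 to 10000 for part (b) */
    x = rand('uniform');        /* X is U(0,1) */
    y = 1 + floor(10*x);        /* Y is an integer between 1 and 10 */
    output;
  end;
run;

proc means data=unif n mean std;   /* sample mean and standard deviation of X */
  var x;
run;

proc freq data=unif;               /* counts and percentages for each value of Y */
  tables y;
run;
```

There is deliberately no `proc print' step here, so the same program can be rerun with 10,000 records without producing pages of output.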

2. In a shaft and bearing assembly in a factory, the diameters of the bearings, X, are normally distributed with mean = 0.526 inches and standard deviation = 0.0035 inches. The diameters of the shafts, Y, are normally distributed with mean = 0.525 inches and standard deviation = 0.0043 inches. An engineer who works at the factory, who had read Section 2.9 of our textbook, said that the probability that the shaft would fit inside the bearing (P(Y < X)) is 0.5716.
Test the engineer's assertion by using SAS to generate 10,000 pairs of independent normally distributed r.v.s X_i and Y_i with the specified means and standard deviations and count the proportion of pairs for which Y_i < X_i. What results do you obtain? Do the results of this simulation appear to be consistent with the engineer's statement, within the limits of sampling error?
(Hints: Use the SAS help pages to find the syntax for using the rand function to generate independent normal r.v.s with mean mu and standard deviation sigma. Specifically, look up either the RAND function or streaminit in the SAS help and documentation pages under Help on the SAS Main Menubar. Look under Index, but Search should also work. Once you generate independent random normal X_i and Y_i with the proper means and standard deviations, set K_i=1 if Y_i < X_i and K_i=0 otherwise. You can define K_i by either an ``if-then-else'' construct or by setting K_i=(Y_i < X_i); SAS (and other computer languages) evaluate a true expression as the number 1 and a false expression as the number 0. Since K is discrete, it can be tabulated as in Problem 1.)
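
A sketch of the construction described in these hints might look like the following. The dataset name fit and the seed are assumptions made for illustration. Note that the third argument of rand('normal',mu,sigma) is the standard deviation, not the variance.

```sas
/* Sketch only: dataset name and seed are assumptions. */
data fit;
  call streaminit(2008);                 /* arbitrary seed */
  do i = 1 to 10;                        /* start small; change to 10000 later */
    x = rand('normal', 0.526, 0.0035);   /* bearing diameter */
    y = rand('normal', 0.525, 0.0043);   /* shaft diameter */
    k = (y < x);                         /* 1 if the shaft fits, 0 otherwise */
    output;
  end;
run;

proc print data=fit; run;                /* delete this step before using 10000 */

proc freq data=fit;                      /* the percentage with k=1 estimates P(Y < X) */
  tables k;
run;
```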

3. As in Problem 3.23 in the text, three pretzel workers are each to use each of three different methods of formulating pretzels (a hard pastry) by making use of nine preformulated blocks of dough. Each block of dough is to be made into 50 pretzels and then baked, for a total of 450 pretzels. It is important that not only the order of the three methods be randomized for each worker, but also the order of times that the nine batches of pretzels are baked in the same oven be randomized.
Use SAS to define a randomized schedule giving the order of the nine procedures to be carried out.
(Hints: There are many ways of doing this. One is to make up nine text strings of the form that Worker#X should use Method#N for X,N=1,2,3, randomly permute the nine text messages, and then display them. See e.g. SRSexamp.sas on the Math3200 Web.)
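
One way to carry out the permutation idea in the hint is to attach an independent uniform random number to each of the nine strings and then sort by it. This is a sketch under that assumption and is not necessarily how SRSexamp.sas does it. The names sched, task, and u and the seed are hypothetical.

```sas
/* Sketch only: names and seed are assumptions. */
data sched;
  call streaminit(17);                   /* arbitrary seed */
  length task $ 32;
  do worker = 1 to 3;
    do method = 1 to 3;
      task = catx(' ', cats('Worker#', worker), 'should use', cats('Method#', method));
      u = rand('uniform');               /* random sort key */
      output;
    end;
  end;
run;

proc sort data=sched; by u; run;         /* sorting by u puts the nine runs in random order */

proc print data=sched;                   /* the Obs column is the time order */
  var task;
run;
```

Sorting on independent uniform keys gives every ordering of the nine runs the same probability, so the order of the methods for each worker is randomized as well.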

HOW TO ORGANIZE your answers for Computer HW1:

     Organize the homework that you hand in into three parts:
     Part (I): Your answer to all three questions in your own words
     Part (II): All of your SAS programs together
     Part (III): All of your SAS output
     See How to Format Computer Homework for more detail. (This is on the main Math3200 Web page.)

For Computer HW1, your answers in Part (I) should be something like
     Problem 1: For N=100, the mean and standard deviation of the 100 X values and what their theoretical values are. (You could also use X_i as the variable name if you like: X_i may be clearer in context than X.) Then comment about whether the mean and standard deviation of the 100 values seem close to the theoretical values within sampling error. Then have a table with the counts of the 100 Y (or Y_i) values for the various k=1,2,3,4,5,6,7,8,9,10 and their percentages as well, or else refer to an explicit page number in Part (III) of your homework that has this information. Comment about whether the percentages seem close to their theoretical values within sampling error. Do all of the above again for N=10,000. Comment about whether you see an improvement in fit to the theoretical values.
     NOTE: DO NOT PRINT OUT any SAS dataset with N=10,000 values! This will produce approximately 130 pages of output that no one will ever look at. You do not need this output for any sane reason. Deleting the `proc print' statement that displays them will NOT AFFECT anything else in the program unless you use some very exotic `proc print' options.
     In general, the sample SAS programs have a `proc print' statement after each SAS data step to make sure that the constructed SAS data set is what it should be. After you check that the dataset looks reasonable, you should delete these `proc print' statements unless their output is extremely short (or unless the output is moderately short and you want to keep them for reassurance). Your SAS output should never be more than a few pages long unless you are analyzing a very complicated problem. (This will not come up in Math3200, but might if you get a job using SAS or in more advanced Stat courses.)
     NOTE: In general, you can use `proc means' to calculate the Mean and Standard Deviation of a column in a SAS data set. See randlist.sas and randlist.list on the Math3200 Web site for an example. Similarly, you can use `proc freq' to construct a table of values for a discrete variable. See SRSexamp.sas and SRSexamp.lst on the Math3200 Web site for an example of such a table. `Proc means' and `proc freq' are two of the most commonly-used SAS routines.
     Problem 2: To answer this question, write down the fraction of times that Y<X (or Y_i<X_i) in N=10,000 random simulations. Compare this fraction with the asserted answer of 0.5716.
     AN HISTORICAL NOTE: As you might have guessed, this problem is very similar to Problem #18 on Exam 1. Once you get used to SAS or any computer language, this simulation might be considered an easier way to do this problem. My impression is that, at least during the 1950s and 1960s, most U.S. engineers would not have been able to do Problem #18 theoretically but could easily do a simulation. During the Cold War, many defense industries liked to hire perhaps one mathematician for every 10 engineers.
     SUGGESTION FOR DOING PROBLEM 2: Run this program with a `proc print' statement with N=10 records to see if (a) the X,Y values look like they might be normally distributed random variables with those means and variances and (b) the K values are such that K=1 if Y<X and K=0 otherwise. The proportion of K=1 values can be found from a `proc freq' statement, but will not be very accurate for N=10.
     Then change N=10 to N=10,000, DELETE THE `PROC PRINT' STATEMENT, and run the SAS program again for a more accurate result.
     Problem 3: To answer this problem, write down the randomized time order of the nine experimental pretzel runs. If you must (a lazier response but still OK), refer to a `proc print' display in Part (III) with the same information after explaining what the permuted strings mean. For a more complete answer, comment about exactly what the experimenter should tell the three individual pretzel workers to do in such a way that they don't get annoyed and get jobs at another pretzel company.

SAS Hints for Computer HW2 due Wednesday Feb 27 by 4:45 PM:

(Problems 4.12, 4.26, 5.2)
See How to Format Computer Homework on the main Math3200 Web page.

SAS hints for CHW2:
     (i) SAS's proc univariate; var xx; run; displays a huge number of single-sample statistics, including stem-leaf and box plots. Note however that SAS's quantiles are overly rounded. You can get text normal plots by saying proc univariate normal plot; var xx; run; and high-resolution normal plots by proc univariate; probplot xx / normal; run;
    WARNING: SAS's normal plots have (X,Y) reflected in comparison with the normal plots in the text. In particular, the appearance of outliers and light tails of distributions is reversed.
    FOR EXAMPLE, outliers to the left appear in SAS's (or STATA's) normal plots on the left and ABOVE the straight line defined by the center of the distribution, as opposed to BELOW the straight line in textbook normal plots, with the reverse for outliers on the upper tail.
     (ii) To enter space-separated numbers without regard for different lines into a single SAS variable, enter (for example)
     data mydata; input xx @@;
     datalines;
     14 25 32 17 99 12 221
     114 2 57 577 14 11 14 -7
     run;
This reads 15 numbers into a single column in the SAS dataset mydata. Normally SAS reads one line, copies numbers into one or more variables, then ignores the rest of the line. The ``trailing @@'' in this case tells SAS to, instead, read one word at a time and to ignore line structure. You still need a final run; on its own line, however.
     (iii) See Problem 1, CHW1, for hints about generating discrete uniform integer-valued r.v.s.
     (iv) To make SAS's proc chart use integer values for its histograms rather than split up the range arbitrarily, try either
     proc chart;   vbar xx / discrete;   run;
or
     proc chart;   vbar xx / midpoints=1 to 6 by 0.50;   run;
The first syntax tells SAS that xx is a discrete variable and it should use the discrete values for histogram blocks.
The second syntax tells SAS to use the histogram intervals that you want SAS to use, not to split up the range into (for example) seven equally-spaced intervals.

SAS Hints for Computer HW3 due Monday Mar 3 by 4:45 PM:

(Problems 5.22, 5.29, 6.7)
See How to Format Computer Homework on the main Math3200 Web page.

SAS hints for CHW3:
     (i) You can use rand('normal') to return a random N(0,1) and rand('normal',mu,sigma) to return a random N(mu,sigma^2)
     (ii) Use SAS's function finv to return F-distribution quantiles and compare with Table A.6. (See Samptt.sas on the Math3200 Sample SAS programs Web site.) Remember that finv() returns quantiles while Table A.6 has critical values, so don't forget to convert. Alternatively, use simulation to check the Table A.6 values. (Do this one way or the other, but you needn't do both ways. The second way requires more programming.)
     (iii) If you say proc means Mean Stddev Stderr; then SAS will print the SEM as well as the data sample standard deviation.
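
As an illustration of the conversion mentioned in hint (ii): finv(p,nu1,nu2) returns the value with area p to its LEFT, while an upper-alpha critical value has area alpha to its RIGHT, so the critical value is finv(1-alpha,nu1,nu2). The degrees of freedom in this sketch are hypothetical and are not taken from any of the assigned problems.

```sas
/* Sketch only: alpha, nu1, and nu2 are hypothetical values. */
data _null_;
  alpha = 0.05;                        /* upper-tail area, as in a critical-value table */
  nu1 = 4;                             /* numerator degrees of freedom */
  nu2 = 10;                            /* denominator degrees of freedom */
  fcrit = finv(1 - alpha, nu1, nu2);   /* quantile with area 1-alpha to its left */
  put alpha= nu1= nu2= fcrit=;         /* writes the values to the SAS log */
run;
```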

Computer HW4 due Monday Mar 17 by 4:45 PM:
See How to Format Computer Homework in general. (This is on the main Math3200 Web page.)

Prob 1. Coverage Probabilities for two Confidence Intervals: The coverage probability of a confidence interval for an unknown parameter mu is the probability that the confidence interval actually contains mu. Thus the coverage probability of a true 95% confidence interval is 0.95, but may differ for an approximate or an incorrect 95% confidence interval.

Simulate the coverage probabilities for two symmetric confidence intervals in the following way. For each of N=1000 independent trials, generate a random sample of 5 independent N(7,5^2) random variables (that is, mu=7 and sigma=5). For each sample of 5, consider (a) the symmetric 95% confidence Z-interval for the true mean based on the sample statistics for Xbar and Sx (with n=5) with the calculated value of Sx assumed known and (b) the symmetric 95% confidence T-interval for the true mean based on Xbar, Sx, and n=5. Estimate the coverage probabilities as the proportion of the two sets of 1000 confidence intervals that contain the value 7. Are either or both of the coverage probabilities close to 95%? If not, which are approximately 95%? Is this what you would have expected?

Probs 2 and 3. Problems 7.13 and 7.16. Also, find the P-values in both problems.

Hints: Prob 1: Note that you will have to generate 1000*5 independent N(7,5^2)s. You can EITHER (I) Within a ``do'' loop of size N=1000, generate five N(7,5^2)s, set XZ=1 if the corresponding Z-interval contains the value 7 (otherwise XZ=99), and set XT=1 if the T-interval contains 7 (otherwise XT=99). Then use proc freq; with tables XZ XT; to find the sample proportions. ALTERNATIVELY, you can (II) generate 5000 N(7,5^2)s in 1000 sets of 5 random normals, with each set of 5 having the same value of an index variable i. Then say ``proc means noprint data=mydata; var xx; by i; output out=myoutdata mean=Xbar std=Stddev n=n; run;'' to write a second dataset `myoutdata' with 1000 rows, each of which has Xbar and Stddev (Sx) for a set of 5 random variables. (See samptt.sas on the Math3200 Web site for a similar use of ``proc means noprint.... out=...'', which is a clever idea due to Ed Spitznagel.) Then re-open `myoutdata' to add values for XZ and XT to the dataset and proceed as in (I). (If you really want to become a SAS expert, you should do it both ways, but you should only hand in one.)
WARNING: DO NOT PRINT OUT the samples for N=1000, which will take approximately 88 printed pages. Good programming practice would be to first write the code with N=10, then print out the N=10 samples, then DELETE THE PROC PRINT STATEMENT and change N=10 to N=1000.
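
Approach (II) might be sketched as follows, starting with N=10 as the warning recommends. The dataset name cover, the helper variables sem, zhalf, and thalf, and the seed are assumptions made for illustration. In proc means, the summary dataset is requested by a statement that begins with the keyword output.

```sas
/* Sketch only: names other than mydata, myoutdata, xx, i, Xbar, Stddev, n, XZ, XT are assumptions. */
data mydata;
  call streaminit(7);                    /* arbitrary seed */
  do i = 1 to 10;                        /* change 10 to 1000 after checking */
    do j = 1 to 5;
      xx = rand('normal', 7, 5);         /* mu=7, sigma=5 */
      output;
    end;
  end;
run;

proc means noprint data=mydata;          /* one output row per value of i */
  var xx;
  by i;
  output out=myoutdata mean=Xbar std=Stddev n=n;
run;

data cover;
  set myoutdata;
  sem = Stddev / sqrt(n);                /* estimated standard error of Xbar */
  zhalf = probit(0.975) * sem;           /* half-width of the Z-interval */
  thalf = tinv(0.975, n - 1) * sem;      /* half-width of the T-interval */
  XZ = 99; XT = 99;
  if abs(Xbar - 7) le zhalf then XZ = 1; /* Z-interval contains 7 */
  if abs(Xbar - 7) le thalf then XT = 1; /* T-interval contains 7 */
run;

proc print data=cover; run;              /* delete this step before using N=1000 */

proc freq data=cover;
  tables XZ XT;
run;
```

The dataset mydata is already grouped by i because of the way it is generated, so no proc sort is needed before the by statement.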

Probs 2 and 3: Consider using SAS's proc ttest. By default, proc ttest; var xx; run; does a two-sided test for H0:mu=0. Say proc ttest H0=Muval; ... to test for H0:mu=Muval. (You can also use proc means.) Note that the lower limit of a one-sided lower 95% confidence interval for the mean is the same as the lower limit of a symmetric 90% confidence interval, which you can get by saying proc ttest ... alpha=0.10; .... Warning: Problem 7.13(b) asks for a one-sided P-value, which you may have to convert from a two-sided P-value.
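
To illustrate the proc ttest options just mentioned, here is a minimal sketch that reuses the 15 numbers from the CHW2 hints as stand-in data. The null value 50 is hypothetical and has nothing to do with Problems 7.13 or 7.16.

```sas
/* Sketch only: the data are the example numbers from the CHW2 hints; h0=50 is hypothetical. */
data mydata; input xx @@;
datalines;
14 25 32 17 99 12 221
114 2 57 577 14 11 14 -7
;
run;

proc ttest data=mydata h0=50 alpha=0.10;   /* tests H0:mu=50 and prints 90% limits */
  var xx;
run;
```

To convert the two-sided P-value: if the t statistic points in the direction of the one-sided alternative, the one-sided P-value is half the two-sided P-value. Otherwise it is one minus half the two-sided P-value.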

Computer HW5 due Monday Mar 24 by 4:45 PM:
See How to Format Computer Homework on the main Math3200 Web page.

Problems 8.8, 8.13, 8.21 (see following)

Problem 8.8: Also find the two-sided P-value for a difference in means. (Hint: The `proc ttest' syntax for a matched-pair design is proc ttest data=mydata; paired yy*xx; run;. See the SAS Help and Documentation for proc ttest;.)

Problem 8.13: Hints: The `proc ttest' syntax for two independent samples is proc ttest; class group; var zz; run;. WARNING: The CI in the output for the mean after Diff(1-2) is the CI using the pooled variance assuming equal variances and Std Err is the SEM using the pooled variance. Since the sample sizes are equal in this case, the pooled and unpooled variances are the same. The pooled and unpooled SEMs (standard errors) are also the same, but the degrees of freedom may differ. Find the two CIs from Xbar-Ybar and SEM(Diff(1-2)) in the output using the two appropriate critical values.
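
For reference, the two intervals described in this hint have the same form and differ only in the degrees of freedom used for the critical value. The following is the standard textbook form, written out under the assumption of equal sample sizes n_1=n_2 (so that the pooled and unpooled standard errors agree); the symbols s_1, s_2, n_1, n_2, and nu are notation introduced here.

```latex
\[
(\bar X - \bar Y) \;\pm\; t_{\nu,\,\alpha/2}\,\mathrm{SEM},
\qquad
\mathrm{SEM} = \sqrt{\frac{s_1^2}{n_1} + \frac{s_2^2}{n_2}}
\quad (n_1 = n_2),
\]
\[
\nu_{\text{pooled}} = n_1 + n_2 - 2,
\qquad
\nu_{\text{unpooled}} =
\frac{\left(s_1^2/n_1 + s_2^2/n_2\right)^2}
     {\dfrac{(s_1^2/n_1)^2}{n_1-1} + \dfrac{(s_2^2/n_2)^2}{n_2-1}} .
\]
```

The unpooled degrees of freedom (the Satterthwaite approximation) are never larger than the pooled degrees of freedom, so the unpooled interval is never shorter.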

Problem 8.21: Do (a) and (b). (a) Instead of a 90% confidence interval for the ratio of variances sigma_1^2/sigma_2^2, find the two-sided P-value for H_0:sigma_1^2=sigma_2^2. In `proc ttest' output, the Folded F is max(s_1^2/s_2^2, s_2^2/s_1^2) for the two samples and the Folded-F P-value is the correct two-sided P-value for a difference in population variances.

(b) Find the two-sided P-values for H_0:mu1=mu2 using both the equal variances hypothesis (sigma_1^2=sigma_2^2) and allowing the two variances to differ (sigma_1^2 ne sigma_2^2). Do the two P-values differ? Do you accept or reject at alpha=0.05 in both cases? Which of the two methods seems more reasonable in this case?

Computer HW6 due Monday Mar 31 by 4:45 PM:
See How to Format Computer Homework on the main Math3200 Web page.

Problems 9.14, 9.18, 9.32 (see following)

Hint: See the sample SAS program SampTables.sas and output on the Math3200 Web site.

On Problem 9.32: ALSO FIND which two cells in the 4x4 table make the largest individual contributions to the overall Pearson chi-square statistic. Is this consistent with what you would have expected for how hair and eye color are distributed in the general population? (Hint: See the last example in SampTables.sas, which shows how to tell SAS to display ``Observed-Expected'' and ``(Observed-Expected)^2/Expected'' for each cell in the table. Recall that the overall Pearson chi-square statistic is the sum of the latter for all cells in the table.)
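
One way to request these per-cell quantities is through the expected, deviation, and cellchi2 options of the tables statement in proc freq. Whether SampTables.sas uses exactly these options is an assumption. The 2x2 counts below are made up purely to make the sketch runnable and are NOT the data of Problem 9.32.

```sas
/* Sketch only: made-up counts and hypothetical variable names. */
data tab;
  input row $ col $ count;
datalines;
A X 30
A Y 10
B X 15
B Y 25
;
run;

proc freq data=tab;
  weight count;                          /* each record stands for count observations */
  tables row*col / chisq expected deviation cellchi2 nopercent norow nocol;
run;
```

In the output, the deviation entry is Observed-Expected and the cell chi-square entry is (Observed-Expected)^2/Expected.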

Computer HW7 due Monday Apr 07 by 4:45 PM:
See How to Format Computer Homework on the main Math3200 Web page.

Problems 10.4, 10.24, 11.2 (see below)

Hint: See the sample SAS program Samp_Reg1.sas and output on the Math3200 Web site.

Problem 10.4: ALSO FIND the P-value for a two-sided test of H_0:b_1=0. Do you accept or reject at alpha=0.05? How many degrees of freedom does the associated t-statistic have?

Problem 10.24: ALSO FIND the P-value for a two-sided test of H_0:b_1=0 and find a symmetric 95% confidence interval for b_1.

Problem 11.2: (Hints: See the last proc reg call in Samp_Reg1.sas. Make sure that your dataset has the correct three columns.)

Computer HW8 due Monday Apr 14 by 4:45 PM:
See How to Format Computer Homework on the main Math3200 Web page.

Problems 11.15, 11.34, 11.37

Hints: (i) See the sample SAS program Samp_Reg3.sas on the Math3200 Web site for options that tell SAS to add to output: (a) ``hat values'' for determining influential observations, (b) VIF scores for collinearity, and (c) confidence intervals for the beta-parameter estimates.
(ii) If you enter proc corr data=mydata; var xx yy zz; run; then SAS will provide a 3x3 matrix whose non-diagonal entries have the appropriate correlation coefficients r_{ij} along with P-values for H_0:rho_{ij}=0.
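
The three kinds of output listed in hint (i) can be requested with the influence, vif, and clb options of the model statement in proc reg. Whether Samp_Reg3.sas uses these particular options is an assumption. The data below are simulated placeholders so that the sketch runs on its own; the coefficients in the data step are arbitrary.

```sas
/* Sketch only: simulated placeholder data with arbitrary coefficients. */
data mydata;
  call streaminit(11);                   /* arbitrary seed */
  do id = 1 to 20;
    xx = rand('normal', 10, 2);
    zz = 0.5*xx + rand('normal', 0, 1);  /* zz is correlated with xx */
    yy = 3 + 2*xx - zz + rand('normal', 0, 1);
    output;
  end;
run;

proc reg data=mydata;
  model yy = xx zz / influence vif clb;  /* hat values, VIF scores, CIs for the betas */
run; quit;

proc corr data=mydata;                   /* correlations with P-values */
  var xx yy zz;
run;
```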

Computer HW9 due Monday Apr 21 by 4:45 PM:
See How to Format Computer Homework on the main Math3200 Web page.

Problems 11.44, 12.7cd, 12.26 (see below)

Problem 11.44:
    For the data in Exercise 11.39, introduce dummy variables Oil for Industry=1 and Drug for Industry=2 and consider 4 possible predictors, Profit, Growth, Oil, and Drug. (See the answers for Problem 11.39 on p702 at the back of the book.) Then
    (i) Do a stepwise regression with SAS' default settings of SLENTRY=0.15 and SLSTAY=0.15. (Hint: See the comments and procedures in Apples.sas on the Math3200 Web site.) Which variables are included in the model? In a regression on these variables only, which are significant? What are their P-values?
    (ii) Do a ``best-subsets'' regression with the Mallows C_p criterion. Compare with the results from part (i).
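
A sketch of the mechanics for Problem 11.44, using simulated placeholder data rather than the data of Exercise 11.39. The dataset name firms, the response name Resp, the third industry code, and all numbers in the data step are assumptions made only so that the sketch runs. It is not necessarily how Apples.sas does it.

```sas
/* Sketch only: simulated placeholder data; Resp and the coefficients are hypothetical. */
data firms;
  call streaminit(44);                   /* arbitrary seed */
  do id = 1 to 30;
    Industry = 1 + mod(id, 3);           /* placeholder codes 1, 2, 3 */
    Profit = rand('normal', 10, 3);
    Growth = rand('normal', 5, 2);
    Resp = 20 + 1.5*Profit + rand('normal', 0, 4);
    Oil = (Industry = 1);                /* dummy variable: 1 if Industry=1, else 0 */
    Drug = (Industry = 2);               /* dummy variable: 1 if Industry=2, else 0 */
    output;
  end;
run;

proc reg data=firms;                     /* part (i): stepwise selection */
  model Resp = Profit Growth Oil Drug / selection=stepwise slentry=0.15 slstay=0.15;
run; quit;

proc reg data=firms;                     /* part (ii): subsets ranked by C_p */
  model Resp = Profit Growth Oil Drug / selection=cp;
run; quit;
```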

Problem 12.7: Parts (c,d) for post-treatment differences only.

Problem 12.26:
    (i) Have SAS create side-by-side boxplots for the five time values. Do the plasma citrate levels stand out at any particular time? (Note the warning about boxplots below.)
    (ii) Have SAS generate an ANOVA table for a randomized block design with Time as the treatment factor and Person as a blocking factor. Are there significant differences between times? between persons? What are the P-values in both cases?
    (iii) Use both the LSD and the Tukey method to determine which pairs of the five times are significantly different at alpha=0.05. Is this what you would have guessed from the boxplots?

HINTS: (1) To have SAS generate side-by-side boxplots of a variable YY for different values of a variable TYPE, try
    proc boxplot data=mydata;   plot YY*TYPE;   run;
    WARNING: The dataset must be sorted by TYPE. More exactly, all observations with the same values of TYPE must be together. Otherwise, the result may be dozens of very tiny boxplots.
    (2) To have SAS test whether the means of a response variable YY are the same for all levels of a categorical variable Cat, try
    proc glm data=mydata;   class Cat;   model YY=Cat;   run;
See for example OneWay.sas on the Math3200 Web site. The class statement is necessary to tell SAS that Cat is a categorical variable and not a numerical variable on which it should do a simple regression.
If there is also a blocking factor (Bloc) as in a randomized block design, change this to
    proc glm data=mydata;   class Cat Bloc;   model YY=Cat Bloc;   run;
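
Putting hints (1) and (2) together, with pairwise comparisons requested through a means statement in proc glm. The data are simulated placeholders with hypothetical effect sizes, not the plasma citrate data of Problem 12.26.

```sas
/* Sketch only: simulated placeholder data for a randomized block layout. */
data mydata;
  call streaminit(12);                   /* arbitrary seed */
  do Bloc = 1 to 6;                      /* blocks (e.g. persons) */
    do Cat = 1 to 5;                     /* treatment levels (e.g. times) */
      YY = 100 + 2*Cat + rand('normal', 0, 3);
      output;
    end;
  end;
run;

proc sort data=mydata; by Cat; run;      /* boxplots need equal Cat values together */

proc boxplot data=mydata;
  plot YY*Cat;
run;

proc glm data=mydata;
  class Cat Bloc;
  model YY = Cat Bloc;
  means Cat / lsd tukey;                 /* LSD and Tukey pairwise comparisons */
run; quit;
```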


Last modified April 26, 2008