TAKEHOME FINAL due on or before Mon 12-20-2010 at 4 PM
NOTE: There should be NO COLLABORATION on the takehome final,
other than for the mechanics of using the computer.
Text references are to the textbook, Cody & Smith, ``Applied statistics and the SAS programming language'', 5th edn
NOTE: See the main Math475 Web page for how to organize a homework
assignment or takehome test using SAS. In particular,
ALWAYS INCLUDE YOUR NAME in a title statement in your SAS
programs, so that your name will appear at the top of each output page.
ALL HOMEWORKS MUST BE ORGANIZED in the following order:
(Part 1) First, your answers to all the problems in the homework,
whether you use SAS for that problem or not. If the problem asks you to
generate a graph or table, refer to the graph or table by page number in
the SAS output (see below). (Xeroxing a page or two from the SAS output or
cutting and pasting into a Word file or TeX source file is also OK.)
(Part 2) Second, all SAS programs that you used to obtain the output for
any of the problems. If possible, similar problems should be done with the
same SAS program. (In other words, write one SAS program for several
problems if that makes things easier, using title or title2 statements
to separate the problems in your output.)
(Part 3) Third, all output for all the SAS programs in the previous
step.
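The title advice above can be sketched as follows. This is only a minimal illustration: the name and the problem labels are hypothetical placeholders, and the procedure steps are left out.

```sas
/* Hypothetical name and labels; replace with your own. */
title 'Jane Doe -- Math475 Takehome Final';

/* title2 replaces any earlier title2 but keeps the first title line, */
/* so every output page shows the name plus the current problem.      */
title2 'Problem 1';
/* ... procedure steps for Problem 1 go here ... */

title2 'Problem 2';
/* ... procedure steps for Problem 2 go here ... */
```

With this layout one program can cover several problems, and each page of output still says which problem it belongs to.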
If an answer in Part 1 requires a table or a scatterplot that you need to
refer to, make sure that your SAS output has overall increasing (unique)
page numbers and make references to Part 3 by page number, such as
``The scatterplot for Problem 2 part (b) is on page #X in
the SAS output below.'' DO NOT say, ``see Page 3 in the SAS output''
if Part 3 has output from several SAS runs, each of which has its own
Page 3. In that case, either write your own (increasing) page numbers
on the SAS output, or else (for example) refer to ``Page 2-7 in the
SAS output'' (for page 7 in the second set of SAS output) and write
page numbers in the format ``2-7'' at the top of pages in your output.
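As a sketch of one way to get overall increasing page numbers without writing them by hand, the PAGENO= system option sets the number of the next output page. This assumes traditional listing output, and the page count of 7 for the first run is a hypothetical value.

```sas
/* First program: number pages starting from 1. */
options number pageno=1;

/* Second program, assuming (hypothetically) that the first run */
/* produced 7 pages: continue the numbering at page 8.          */
options number pageno=8;
```

Each options statement would go at the top of its own program; they are shown together here only for comparison.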
Different parts of problems may not be equally weighted.
Five (5) problems.
Problem 1. Heights and weights for the employees of VaporLock Software Services are recorded in Table 1. Each table entry has the height, weight, and sex for one employee, in that order. The employees of this company are known to be unusual.
Table 1 --- Height, Weight, Gender for 79 VaporLock Employees
69 149 M 66 189 M 82 134 M 60 144 F
71 113 F 69 98 F 72 179 M 65 198 M
58 147 F 74 83 F 61 125 F 69 191 M
70 98 F 66 98 F 64 198 M 61 117 F
68 191 M 68 105 F 74 137 M 72 181 M
77 129 M 70 145 M 64 126 F 75 132 M
78 139 M 75 149 M 74 138 M 74 135 M
72 80 F 61 114 F 66 113 F 67 160 M
73 150 M 70 115 F 72 91 F 61 90 F
68 79 F 76 149 M 67 94 F 69 90 F
59 104 F 61 118 F 69 86 F 68 95 F
57 134 F 56 139 F 70 180 M 78 165 M
68 114 F 73 88 F 58 124 F 63 121 F
69 174 M 65 126 F 77 128 M 79 136 M
66 92 F 67 136 F 66 123 F 78 149 M
68 139 M 70 94 F 62 105 F 71 117 F
65 112 F 77 148 M 70 177 M 59 125 F
76 179 M 63 139 F 70 97 F 69 88 F
76 170 M 72 143 M 71 143 M 80 135 M
78 161 M 58 131 F 69 178 M
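A minimal sketch of one way to read a table laid out like Table 1, with several employees per row. The data set name vaporlock and the variable names are assumptions, and only the first two rows of the table are shown.

```sas
data vaporlock;
   /* The trailing @@ holds the current line, so the three     */
   /* fields are read repeatedly until the line is used up.    */
   input height weight sex $ @@;
   datalines;
69 149 M 66 189 M 82 134 M 60 144 F
71 113 F 69 98 F 72 179 M 65 198 M
;
run;

/* Print the data as a check that it was read as intended. */
proc print data=vaporlock;
run;
```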
Problem 2. Lengths and widths were measured for two types of aphids (a small insect) collected in a semitropical country. The entries in Table 2 are the lengths and widths, respectively, for 56 aphids, in units of tenths of a millimeter.
Table 2 --- Lengths and widths (0.1mm) for 56 aphids.
Type A (n=17):
258 237 273 226 287 210 289 231
304 237 309 207 311 237 314 234
319 197 330 216 333 185 335 187
342 189 352 195 357 200 365 201
371 185
Type B (n=39):
239 241 256 228 260 213 266 207
271 226 273 187 278 230 280 220
281 183 284 200 286 191 291 214
292 233 293 199 296 195 296 205
300 228 302 200 303 198 303 203
307 215 312 191 318 229 321 181
322 193 322 193 322 219 323 197
326 217 328 178 328 190 330 178
335 187 339 175 340 191 346 177
346 183 358 178 360 177
Nreading.sas and Ncoffee.sas on the Math475 Web
site, and also in HotLizards.sas. In all three SAS example
input files, the MANOVA code is at the end.)
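For the data in Table 2, one possible way to build a single data set with a type variable is to read each type separately and then concatenate. This is only a sketch: the data set and variable names are assumptions, and only the first row of each type is shown.

```sas
data typeA;
   type = 'A';
   input len wid @@;      /* (length, width) pairs, several per line */
   datalines;
258 237 273 226 287 210 289 231
;
run;

data typeB;
   type = 'B';
   input len wid @@;
   datalines;
239 241 256 228 260 213 266 207
;
run;

/* Stack the two data sets; type records where each aphid came from. */
data aphids;
   set typeA typeB;
run;
```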
Problem 3. A manufacturing company with four factories wants to control the number of defects in the main product that it manufactures. As a first step, the company wants to know where most of the variation of the defects is located: among factories, among groups (workgroups) working within the same factory, or from month to month within the same workgroup.
Table 3 --- Product defects by factory and workgroup
Factory1:
Group1 30 15 17 Group2 22 6 31 Group3 21 26 15
Factory2:
Group1 32 30 32 Group2 31 27 21 Group3 32 31 35
Group4 27 50 36 Group5 21 29 34
Factory3:
Group1 20 30 29 Group2 21 27 21 Group3 28 23 33
Group4 14 14 25
Factory4:
Group1 20 26 26 Group2 23 19 17 Group3 36 30 32
Group4 11 29 14 Group5 17 22 35
Note that ``Group1'' does not refer to the same group in different
factories, but only to the first workgroup from that factory that
happened to send data to the parent company. (Treat the three
observations for each workgroup as independent and identically
distributed samples for that workgroup.)
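Since the group labels in Table 3 only have meaning within a factory, it helps to record both the factory and the group for every observation. A minimal data-entry sketch follows; the names defects, rep, and count are assumptions, and only Factory1 is shown.

```sas
data defects;
   /* The trailing @ holds the line so the loop can read the */
   /* three observations for this workgroup.                 */
   input factory group @;
   do rep = 1 to 3;
      input count @;
      output;              /* one output row per observation */
   end;
   datalines;
1 1 30 15 17
1 2 22 6 31
1 3 21 26 15
;
run;
```

In SAS procedures that accept nested effects, such as PROC GLM, groups within factories are written as group(factory).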
Problem 4. An engineer is interested in the running temperature of a mechanical device as a function of three variables: heat-shield type, with two levels (H1, H2); fan size, with three levels (F1, F2, F3); and heat-baffle type, with five levels (B1, B2, B3, B4, B5). One observation of the running temperature is made for each combination of levels of the three variables. The running temperatures are listed in Table 4.
Table 4. Running Temperatures of a Device
              B1    B2    B3    B4    B5
F1    H1     199   175   187   169   189
      H2     196   196   221   196   244
F2    H1     203   182   178   181   193
      H2     176   179   217   245   244
F3    H1     177   173   178   184   174
      H2     166   204   207   205   284
ThreeWay.sas on the Math475 Web site, or else by following
one of the suggestions on page 219-220 in the textbook.)
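Because Table 4 is a complete grid with one value per cell, the factor levels can be generated by nested loops while the temperatures are read in row order. This is a sketch only; the data set and variable names are assumptions.

```sas
data temps;
   /* The loop order matches the row order of Table 4:      */
   /* fan size, then heat shield, then baffle across a row. */
   do fan = 1 to 3;
      do shield = 1 to 2;
         do baffle = 1 to 5;
            input temp @@;
            output;
         end;
      end;
   end;
   datalines;
199 175 187 169 189
196 196 221 196 244
203 182 178 181 193
176 179 217 245 244
177 173 178 184 174
166 204 207 205 284
;
run;
```

Printing the data set and comparing a few cells against the table is a quick check that the loop order is right.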
Problem 5. A study of nerve fibers is made for 5 normal and 5 diabetic rats. The experimenter wants to learn how the cross-sectional areas of the fibers of a particular nerve vary with the diabetic state, and also how they vary with the position along the nerve fibers (Proximal, Medial, or Distal). For definiteness, let Group be the factor whose levels are Normal (labeled Control in Table 5) and Diabetic, and NvLoc a factor with levels Proximal, Medial, and Distal. The nerve cross-sectional areas for the 10 rats are in Table 5.
Table 5 --- Cross-Sectional Areas of Nerve Fibers in 10 Rats
Subj Proximal Medial Distal
Diabetic 1. 529 446 373
Diabetic 2. 604 455 404
Diabetic 3. 523 500 378
Diabetic 4. 504 392 390
Diabetic 5. 518 486 375
Control 6. 394 360 513
Control 7. 352 395 529
Control 8. 370 317 571
Control 9. 261 370 586
Control 10. 348 400 530
NestedSubj2Fac.sas for the appropriate decomposition of a
full factorial ANOVA model in this case. See comments in
NCoffee.sas, NReading.sas, or the text for the
``standard way'' to test effects in nested subject models with one
observation per cell.)
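Table 5 has one row per rat, while an ANOVA with NvLoc as a factor needs one row per measurement. A minimal sketch of that rearrangement follows. The data set and variable names other than Group and NvLoc are assumptions, and only two rats are shown.

```sas
data nerve_wide;
   input group $ subj proximal medial distal;
   datalines;
Diabetic 1 529 446 373
Control  6 394 360 513
;
run;

data nerve_long;
   set nerve_wide;
   length nvloc $ 8;
   array area_at{3} proximal medial distal;
   array locname{3} $ 8 _temporary_ ('Proximal' 'Medial' 'Distal');
   /* Write one output row per nerve location for each rat. */
   do i = 1 to 3;
      nvloc = locname{i};
      area  = area_at{i};
      output;
   end;
   keep group subj nvloc area;
run;
```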