Lifetime data (survival data) are commonly encountered in epidemiology, biostatistics, and biomedical and clinical studies. Unlike other types of data, survival times usually have skewed distributions and include censored observations. For example, in a cancer study, a patient may leave the country and stop visiting the clinical center before responding to treatment. The statistical software packages SAS and R will be used to handle problems when hand calculation is not feasible. Techniques developed in survival analysis are also used in reliability research in business and engineering. This is an upper-level undergraduate or master's-level course. It is useful for Actuarial Exams and important for biostatisticians.
Instructor: Jimin Ding
Office Hours: Thur. 4:00-5:30pm, or by appointment
Topics covered:
Survival and hazard functions, suitable parametric distributions for lifetime data, life-table technique and population standardization, Kaplan-Meier (product-limit) survival curve estimation, nonparametric hypothesis testing for censored data, log-rank test, parametric regression models for lifetime data, logistic regression, Cox proportional hazards models, Cox model with time-independent and time-dependent covariates.
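As a small illustration of the Kaplan-Meier (product-limit) estimator listed above, the following Python sketch computes the survival curve by hand for right-censored data. It is not part of the course materials: the function name kaplan_meier and the six-subject data set are made up for illustration, and the sketch assumes the usual convention that a subject censored at time t is still counted as at risk at t.

```python
from collections import Counter


def kaplan_meier(times, events):
    """Product-limit estimate of the survival function.

    times  : observed time for each subject (event or censoring time)
    events : 1 if the event was observed, 0 if the time is right-censored
    Returns a list of (t, S(t)) pairs, one per distinct event time.
    """
    deaths = Counter(t for t, e in zip(times, events) if e)
    leaving = Counter(times)      # everyone who leaves the risk set at t
    at_risk = len(times)          # number still under observation
    surv = 1.0
    curve = []
    for t in sorted(leaving):
        d = deaths.get(t, 0)
        if d > 0:
            # Multiply by the conditional probability of surviving past t.
            surv *= 1.0 - d / at_risk
            curve.append((t, surv))
        # Events and censored subjects both leave the risk set after t.
        at_risk -= leaving[t]
    return curve


# Hypothetical data: six subjects, two of them censored (event = 0).
times = [2, 3, 3, 5, 7, 8]
events = [1, 1, 0, 1, 0, 1]

curve = kaplan_meier(times, events)
for t, s in curve:
    print(f"t = {t}: S(t) = {s:.4f}")
```

With these made-up data the estimate drops only at the observed event times 2, 3, 5 and 8; the censored subjects at times 3 and 7 reduce the size of the risk set without producing a step of their own.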
Prerequisites:
Math 309 and Math 320, or equivalents.
Textbook:
John P. Klein and Melvin L. Moeschberger
Survival Analysis, 2nd edition
Springer, 2005, ISBN 038795399X
(http://www.biostat.mcw.edu/homepgs/klein/book.html)
Homework and Exams:
There will be 3 take-home projects, due on Sep. 29th, Oct. 29th, and Dec. 1st. (The problem assignments will be given at least 2 weeks before the due date.) In the last week of class (Dec. 1st and 3rd), every student will have 10 minutes to present their third homework and 2 minutes to answer questions from the teacher or students. Homework turned in within 24 hours after the due date will have its grade scaled by 60%. Homework turned in more than 24 hours after the due date will not be graded.
An in-class midterm will be given on Oct. 15 (Thur.).
Grades:
There will be three homework projects and one midterm. Grades will be based on the homework sets (around 65%), the presentation (around 10%) and the midterm (around 20%).
Electronic files of your programs for the projects and final are required within 24 hours of the due date.
Submitting e-files (sending emails) will count for 5% of your grade.
E-files should be sent directly by email and will be used only for random checks.
Collaboration:
Collaboration on homework is encouraged and can be helpful (and fun),
both for using the computer and for doing problems. However, you must
do all written work by yourself, both computer programs and answers to
homework questions. You must also write, enter, and run all programs
yourself.
If you collaborate with someone on a homework, list his or her
name in a note at the top of the first part of your homework.
There should be NO COLLABORATION on the take-home final, other than for
the mechanics of using the computer.
WARNING:
Make a copy of each homework before you hand it in!
It may not be returned before you need to refer to it for the next
homework (or for the next test).
Format of your homework:
- Part I: Answer all questions part by part, handwritten or printed. Cite the output in the appendix to support your arguments and conclusions.
If the answer to a problem requires a table or a plot that you need to refer to in your answers, add page numbers to your homework and make
references in part I by page number, such as "The scatterplot
for part (c) is on page #X in the SAS output below."
Alternatively, you could copy the output and
include it in part I along with annotations as well as in
part III, but references by page number will usually be enough.
If a problem asks you to do a statistical test, EXPLAIN CLEARLY what
the null hypothesis H_0 is, what test you used, what the P-value is, and
whether the result is significant, highly significant, or neither.
- Part II: Attach the SAS or R programs you used as an appendix if you relied on them for any argument or conclusion in part I.
All programs should be structured, or have enough comments, so that
someone who looks at the program a year from now can easily tell what the
program is doing. For programs in SAS, it is even better to put descriptive comments in
title (or title2 or title3) statements, since
these will appear in the SAS output as well as in the SAS program.
Programs may be graded for understandability.
- Part III: Attach the necessary output from your programs to support part I.
Some useful links and references:
Guide to USING SAS by Prof. Stanley Sawyer
SAS Online Printed Manuals: detailed descriptions of SAS procedures, including all available options and related statistical theory. See SAS Online Manuals by Prof. Sawyer for more details.
Using the SAS Windowing Environment: A Quick Tutorial, L. Hatcher, SAS Institute Press, 2001.
The statistical analysis of failure time data, 2nd Edition, J. D. Kalbfleisch and R. L. Prentice (2002), John Wiley & Sons.
Statistical Methods for Survival Data Analysis, Elisa Lee and John Wang, 3rd Ed., John Wiley & Sons, 2003.
Survival analysis: a self-learning text, David G. Kleinbaum, Springer, 1996.
Survival analysis using the SAS system: a practical guide, Paul D. Allison, SAS Institute Press, 1995.
Good books for reviewing elementary statistics:
A Data-Based Approach to Statistics, R. L. Iman, Duxbury Press, 1994.
Statistics and Data Analysis: From Elementary to Intermediate, A. J. Tamhane and D. D. Dunlop, Prentice-Hall, 2000.