Instructor: Todd Kuffner (kuffner@math.wustl.edu)
Lecture: 4:00-5:30pm, Monday/Wednesday, Simon Hall, Room 018
Office Hours: Monday 8:00-9:00am, Tuesday 4:00-6:00pm; Cupples I Room 18 (basement).
Course Overview: This course provides students with an introduction to the foundations of modern computational statistics. Students will learn the basics of numerical analysis, random number generation, and computational tools for statistical inference, specifically Monte Carlo methods and the bootstrap. Students will be introduced to SAS during the first part of the course. Thereafter, students are welcome to use R or SAS (or both) for the relevant parts of homework assignments.
Prerequisite: It is assumed that students have taken a first course in multivariate-calculus-based probability (including central limit theorems, laws of large numbers, and transformations of variables), a first course in linear or matrix algebra, and a course in statistics (including the principles of statistical inference and common estimation methods such as maximum likelihood), and that they have some familiarity with programming in either R or SAS.
Textbook: Both books are required. You may have electronic access through Washington University; log in to My Catalog on the library website and search for these books. Recommended readings for each lecture will consist of sections from these books.
- The Little SAS Book: A Primer (5th edition, 2012) by Lora D. Delwiche and Susan J. Slaughter
- Computational Statistics (1st edition, 2009) by James E. Gentle (errata)
Software: Washington University Arts & Sciences provides SAS through Remote Desktop. Instructions here. SAS is also available in some computer labs on campus (ask Arts & Sciences computing). Alternatively, students may obtain SAS University Edition. Later in the course, students may choose to use R, which is free and open source.
Homework: There will be regular homework assignments. For the first part of the course, you may find example code and data from The Little SAS Book here: http://support.sas.com/publishing/authors/delwiche.html. Homework will be graded, but solutions will not be provided to students.
Homework grader: Yiqian Fang (yiqianfang@wustl.edu)
Blackboard: During the semester, homework assignments, homework and midterm exam grades, and any other course-related announcements will be posted to Blackboard or sent by email using Blackboard.
Attendance: Attendance is required for all lectures. A student who misses a lecture is responsible for any assignments or announcements made during that lecture.
Grades: 15% Homework, 20% Midterm 1, 20% Midterm 2, 45% Final
Exams: 2 in-class midterms and 1 final. The dates of the exams should not be considered fixed until the first day of class. What appears on Course Listings may be incorrect.
Homework: There will be weekly homework assignments.
The lowest homework grade will be dropped. If you added the class late and missed the first homework, then that is the homework that will be dropped.
Final Course Grade: The letter grades for the course will be determined
according to the following numerical grades on a 0-100 scale.
A+ | impress me (very rare) | B+ | [87, 90) | C+ | [77, 80) | D+ | [67, 70) | F | [0, 60)
A  | 93+                    | B  | [83, 87) | C  | [73, 77) | D  | [63, 67) |   |
A- | [90, 93)               | B- | [80, 83) | C- | [70, 73) | D- | [60, 63) |   |
Other Course Policies: Students are encouraged to look at the Faculty of Arts & Sciences policies.
- Academic integrity: Students are expected to adhere to the University's policy on academic integrity.
- Auditing: There is an option to audit, but this still involves enrolling in the course. See the Faculty of Arts & Sciences policy on auditing. Auditing students will still be expected to attend all lectures and complete all required coursework and exams. A course grade of 75 is required for a successful audit.
- Collaboration: Students are encouraged to discuss homework with one another, but each student must submit separate solutions, and these must be the original work of the student.
- Exam conflicts: Read the University policy. The exam dates for this course are posted before the semester begins, and thus you are expected to be present at all exams.
- Late homework: Only by prior arrangement. If a valid reason for an exception is not presented at least 36 hours before a homework due date, then the homework will not be accepted late (a zero will be given for that assignment).
- Missed exams: There are no make-up exams. For a valid excused absence from a midterm exam (such as a medical, family, transportation or weather-related emergency), the contribution of that midterm to the final course grade will be redistributed equally to the other midterm exam and the final exam. Students missing both midterm exams and/or the final exam cannot earn a passing grade for the course.
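As a rough illustration of how the grade weights, the dropped lowest homework, and the missed-midterm redistribution combine, here is a minimal Python sketch. It is not an official grade calculator. The function names are hypothetical, and it assumes that every score is on a 0-100 scale, that the homework component is the plain average of the remaining homework scores, and that "redistributed equally" means half of the missed midterm's 20% goes to the other midterm and half to the final. The A+ row is left out because it is not tied to a numerical cutoff.

```python
from typing import Optional, Sequence

# Lower cutoffs (inclusive) copied from the grade table above.
# A+ is discretionary, so it is not computed here.
CUTOFFS = [
    (93, "A"), (90, "A-"),
    (87, "B+"), (83, "B"), (80, "B-"),
    (77, "C+"), (73, "C"), (70, "C-"),
    (67, "D+"), (63, "D"), (60, "D-"),
]


def course_score(
    homework: Sequence[float],
    midterm1: Optional[float],
    midterm2: Optional[float],
    final: float,
) -> float:
    """Weighted course score on a 0-100 scale.

    Pass None for a midterm missed with a valid excuse.
    """
    if not homework:
        raise ValueError("at least one homework score is required")
    if midterm1 is None and midterm2 is None:
        raise ValueError("missing both midterms cannot yield a passing grade")

    # Drop the lowest homework score (only if there is more than one).
    kept = sorted(homework)[1:] if len(homework) > 1 else list(homework)
    hw_avg = sum(kept) / len(kept)

    weights = {"hw": 0.15, "m1": 0.20, "m2": 0.20, "final": 0.45}
    scores = {"hw": hw_avg, "m1": midterm1, "m2": midterm2, "final": final}

    for missed, other in (("m1", "m2"), ("m2", "m1")):
        if scores[missed] is None:
            # Assumed reading of "redistributed equally": half to the
            # other midterm, half to the final.
            half = weights[missed] / 2
            weights[other] += half
            weights["final"] += half
            weights[missed] = 0.0
            scores[missed] = 0.0

    return sum(weights[k] * scores[k] for k in weights)


def letter_grade(score: float) -> str:
    """Map a numerical score to a letter grade using the cutoffs above."""
    for cutoff, letter in CUTOFFS:
        if score >= cutoff:
            return letter
    return "F"
```

For example, homework scores of 80, 90 and 100 with midterm scores of 85 and 75 and a final exam score of 90 give 0.15(95) + 0.20(85) + 0.20(75) + 0.45(90) = 86.75, which falls in the B range.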
Course Schedule: Tentative; will be updated after each lecture to reflect what was actually covered. LSB = Little SAS Book, CS = Computational Statistics.
Lecture 1: Overview
Roles of estimation, simulation, and optimization in statistical inference
Reading: CS 1.1-1.8 (review of prerequisite material); LSB 1.1-1.13
HW1 assigned
|
Lecture 2: Computer Storage and Arithmetic
Fixed-point and floating-point number systems; errors
Reading: CS 2.1-2.3; LSB 2.1-2.21
|
Lecture 3: Algorithms and Programming I
Numerical errors; algorithms and data
Reading: CS 3.1-3.2; LSB 3.1-3.12
|
Lecture 4: Algorithms and Programming II
Efficiency
Reading: CS 3.3; LSB 4.1-4.24
HW3 assigned
|
Lecture 5: Algorithms and Programming III
Iterations and convergence; programming; computational feasibility
Reading: CS 3.4-3.6; LSB 4.1-4.24
|
Lecture 6: Function Approximation
Function approximation and smoothing; basis sets in function spaces
Reading: CS 4.1-4.2 (see Ch. 10 for perspective)
|
Lecture 7: Vector Spaces Review
|
Lecture 8: Function Approximation II
Review of Taylor series expansions for multivariable functions; inner products on function spaces; orthogonal polynomials; applications of orthogonal polynomials (refinements of the classical univariate central limit theorem with Edgeworth expansions in orthogonal Hermite polynomials)
Reading: CS 4.3-4.4 (see Ch. 10 for perspective)
HW3 due
|
Lecture 9: Function Approximation III
Splines
Reading: CS 4.4
|
Lecture 10: Review for first Midterm
Unconstrained descent methods in dense domains; unconstrained combinatorial and stochastic optimization
|
Lecture 11: Kernel Methods
Reading: CS 4.5 (see Ch. 10 for perspective); LSB 5.1-5.13
|
Midterm 1 during class on Monday 10th October
Material: Lectures 1-10
|
Lecture 12: Introduction to integral approximation
Why must we approximate integrals? Liouville's theorem; Risch algorithm; statistical examples; overview of approximation methods
Reference: see slides
|
Lecture 13: Gaussian Quadrature
Reference: CS Ch. 4
|
Lecture 14: Basics of Bayesian Computational Statistics
Motivating uses of integral approximation in statistical inference; common setting of MCMC
Reference: slides
|
Lecture 15: Saddlepoint and Laplace Approximation
Deterministic integral approximation methods; Bayesian logistic regression example
Reference: slides
|
Lecture 16: Random Variable Generation; Monte Carlo Integration
Quantile transform method; rejection sampling; importance sampling
Reference: CS Ch. 7, 11 and Appendix A
|
Lecture 17: More on RNG; Intro to MCMC
Types of pseudo-random number generators; basics of Markov chain theory
Reference: CS Ch. 7, 11 and Appendix A
|
Lecture 18: More Markov chain theory; Metropolis-Hastings and Gibbs
Independent and random walk Metropolis-Hastings; optimal scaling and convergence diagnostics
Reference: CS Ch. 11 and slides
|
Midterm 2 during class on Wednesday 9th November
Material: Lectures 11-17
|
Lecture 19: MCMC Examples
Metropolis-Hastings and Gibbs examples; convergence diagnostics (Gelman-Rubin); writing R functions for MCMC
Reference: see Blackboard articles
|
Lecture 20: Introduction to Bootstrap
Nonparametric bootstrap; bias estimation and standard error estimation; jackknife
Reference: CS Ch. 12 and 13
|
Lecture 21: Bootstrap Confidence Intervals
Normal, percentile and bootstrap t intervals; BCa intervals; implementation in R
Reference: CS Ch. 12 and 13
|
Lecture 22: Cross-Validation and Permutation Tests
Reference: CS Ch. 12 and 13
|
Lecture 23: Current Research in Computational Statistics
Reference: slides
|
Last day of fall semester classes is 12/09
|
Final Exam is Friday 12/16, 6:00-8:00pm (see your exam schedule for the room)
|