Math 462: Mathematical Foundations of Data Science/Big Data
Spring 2020

Instructor: Todd Kuffner

Lecture: MWF 9:00-9:50am

Course Description: Mathematical and statistical foundations of data science. Core topics include high-dimensional probability, concentration of measure, matrix concentration inequalities, essentials of random matrix theory, and linear dimension reduction. Other topics will be chosen by the instructor.

List of Topics (tentative):
Prerequisites: Multivariable calculus (Math 233), linear or matrix algebra (Math 429 or 309), and multivariable-calculus-based probability and mathematical statistics (Math 493-494). Prior familiarity with analysis, topology, and geometry is strongly recommended. A willingness to learn new mathematics as needed is essential. In particular, we will make heavy use of concepts related to vector spaces and function spaces.

Textbook: There is no required textbook for the course, though for some parts of the course we will closely follow Roman Vershynin's wonderful 2019 book, High-Dimensional Probability, published by Cambridge University Press. The lectures and that book are the primary references for the course, but other reference books and freely available references may also be suggested. Details will be posted on Canvas.

Important Dates and Course Schedule: Details will be posted on Canvas. I will probably update the table below later in the semester to detail what was covered for future reference.

Jan. 13       First day of classes
Jan. 20       No class (Martin Luther King Holiday)
Jan. 23       Last day to drop/add
March 9-13    No classes (Spring Break)
April 24      Last day of classes


Course Policies and Grades

Canvas: During the semester, all course-related materials and announcements will be posted to Canvas and/or sent by email to registered students.

Grades: Homework 25%, Midterm 20%, Final Exam 25%, Participation 10%, Group Project & Presentation 20%
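
As a rough illustration of how these weights combine, here is a minimal Python sketch. It assumes each component has already been scored on a 0-100 scale, which the syllabus does not state explicitly; the names `WEIGHTS`, `course_score`, and the example scores are hypothetical and used only for illustration.

```python
# Weights copied from the Grades line above; they sum to 1.0.
WEIGHTS = {
    "homework": 0.25,
    "midterm": 0.20,
    "final_exam": 0.25,
    "participation": 0.10,
    "project": 0.20,
}


def course_score(scores):
    """Return the weighted course score on a 0-100 scale.

    `scores` maps each component name in WEIGHTS to a score in [0, 100].
    Scoring every component on a 0-100 scale is an assumption of this
    sketch, not something the syllabus specifies.
    """
    missing = set(WEIGHTS) - set(scores)
    if missing:
        raise ValueError(f"missing components: {sorted(missing)}")
    return sum(WEIGHTS[name] * scores[name] for name in WEIGHTS)


# Hypothetical student: 0.25*90 + 0.20*80 + 0.25*85 + 0.10*100 + 0.20*95
example = {
    "homework": 90,
    "midterm": 80,
    "final_exam": 85,
    "participation": 100,
    "project": 95,
}
total = course_score(example)  # approximately 88.75
```

How the homework component itself is averaged, including the dropped lowest grade, is left to the instructor and is not modeled here.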

Homework:
There will be roughly one homework assignment for every 4-5 lectures. You may discuss problems with other students, but the solutions you submit must be entirely your own work. Explanations detailing the steps of proofs or other mathematical arguments are required for full credit. You are encouraged, but not required, to write your solutions in TeX/LaTeX and submit a printed version. I will drop the lowest homework grade only if you have submitted every homework and genuinely attempted all of the problems.

Exams: There will be a take-home midterm exam and a take-home final exam.

Participation: Attendance and participation are required for all lectures; attendance alone is not enough. Participation includes: (i) answering questions that I ask the class; (ii) providing a summary, definition, or result from the previous lecture when I ask you to.

Group Project & Presentation: Groups will be assigned. Each group will be given a project, which may include reading a paper on a topic not covered during lectures, or doing a literature search on an open problem in the mathematical foundations of data science. The group must submit a 5-10 page report, written in LaTeX, and prepare a 25-minute presentation for the rest of the class using slides (made with the Beamer document class in LaTeX). The speaking roles in the presentation must be shared equally among all members of the group. The final report and presentation will be due during the final two weeks of classes.

Final Course Grade: The letter grades for the course will be determined according to the following numerical grades on a 0-100 scale.
A+  impress me    B+  [87, 90)    C+  [77, 80)    D+  [67, 70)    F  [0, 60)
A   93+           B   [83, 87)    C   [73, 77)    D   [63, 67)
A-  [90, 93)      B-  [80, 83)    C-  [70, 73)    D-  [60, 63)
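
For reference, the half-open intervals in the grade scale can be read as a simple lookup. The following Python sketch is one way to express that mapping; the names `CUTOFFS`, `LETTERS`, and `letter_grade` are hypothetical. Because A+ is awarded at the instructor's discretion ("impress me") rather than by a numerical cutoff, the sketch never returns it.

```python
from bisect import bisect_right

# Lower endpoints of the intervals in the grade scale, in increasing order.
CUTOFFS = [60, 63, 67, 70, 73, 77, 80, 83, 87, 90, 93]

# LETTERS[i] is the grade for scores at or above CUTOFFS[i - 1]
# and below CUTOFFS[i]; LETTERS[0] covers [0, 60).
LETTERS = ["F", "D-", "D", "D+", "C-", "C", "C+", "B-", "B", "B+", "A-", "A"]


def letter_grade(score):
    """Map a numerical course score in [0, 100] to a letter grade.

    A+ has no numerical cutoff in the syllabus, so scores of 93 and
    above map to "A" here.
    """
    if not 0 <= score <= 100:
        raise ValueError("score must lie in [0, 100]")
    # bisect_right counts the cutoffs that are <= score, which makes
    # each interval closed on the left and open on the right.
    return LETTERS[bisect_right(CUTOFFS, score)]
```

For example, `letter_grade(88.75)` falls in [87, 90) and gives "B+", while `letter_grade(90)` gives "A-" because each interval includes its lower endpoint.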




Other Course Policies: Students are encouraged to look at the Faculty of Arts & Sciences policies.