Time & Location: Lecture: MW 4:30–5:45 pm, 2236 EB3. Discussion: F 12:50–1:40 pm, 1005 EB1.
Instructor: Dr. Chau-Wai Wong
Teaching Assistants: Mr. Prasun Datta, Ms. Chanae Ottley, and Mr. Anupam Mijar
Office Hours: Scheduled as below (Zoom link can be found here. If none of the timeslots works for you, you may request an office-hour session via a Piazza private message.)
Day | Time | Place | Person |
---|---|---|---|
Mondays | 11 am–12 pm | Zoom | Mijar |
Thursdays | 9–9:30 am | Zoom | Wong |
Thursdays | 1:30–2:30 pm | 2117 EB2 | Datta/Ottley |
Sundays | 2–2:30 pm | Zoom | Wong |
Course Description: Deep learning has progressed remarkably over the past decade. This course introduces fundamental concepts and algorithms in machine learning that are vital for understanding state-of-the-art developments in deep learning. The course exposes students to real-world applications via well-guided homework programming problems and projects. Topics include, but are not limited to, regression, classification, support vector machines, cross-validation, convolutional neural networks (CNNs), long short-term memory (LSTM), and transformers (e.g., BERT and GPT).
Prerequisites: ST 300-level or above, and ECE301/CSC316/ISE361/MA341. Talk to the instructor about whether the prerequisites can be waived.
Course Structure: The course consists of two 75-min lectures and one 50-min discussion section per week. A teaching assistant will lead the discussion section, covering practice problems and answering questions from students. There will be weekly homework assignments (30%) that contain both written and programming problems, two midterm exams (20% × 2), and one term project (30%). Programming will be in Python, R, or Matlab. Students are expected to be able to write computer programs and to have mathematical maturity in probability theory (e.g., have taken a 300-level statistics course) before taking the course. A linear algebra course such as MA305/405 is recommended while taking the course.
Course Forum: Piazza
Homework Submission: Gradescope
Textbooks:
Topics: Linear statistical models, Bayesian classifiers, support vector machine (SVM), clustering, principal component analysis (PCA), naive Bayes, topic model, hidden Markov model (HMM), convolutional neural networks (CNN), long short-term memory (LSTM), and transformers.
Class # | Date | Topic | Lecture notes | Readings | HW Assignment
---|---|---|---|---|---
1 | 8/21 | Introduction | Video: Can We Build a Brain? | ISLR Ch1–2; ML Supp | HW1 (due 8/28)
2 | 8/23 | Machine learning overview | Slide deck 1 | ISLR Ch1–2 |
3 | 8/28 | Supervised learning | | ISLR Ch1–2 | HW2 (due 9/11)
4 | 8/30 | Linear regression: matrix-vector form; least squares; linear algebra | | Scheffe Ch1, App 1 |
Deep Learning | | | | |
 | 9/4 | Labor Day | | |
5 | 9/6 | Linear algebra (cont'd); Geometric interpretation | | | HW3 (due 9/21)
6 | 9/11 | Modern ML applications: CNN | Slide deck 2 | DL Ch6, Ch9 |
7 | 9/13 | Modern ML applications: LSTM | | DL Ch10 | HW4 (due 9/27)
8 | 9/18 | Modern ML applications: Transformers (BERT & GPT) | | |
9 | 9/20 | Neural network training: Backpropagation | | DL Ch8, Ch11 |
Linear Statistical Models: Regression | | | | |
10 | 9/25 | Backpropagation (cont'd); Regression function | HT Ch2 | ISLR Ch2; Devore 3.3, 4.2, 2.4; Leon 3.2, 5.7 |
11 | 9/27 | Conditional expectation | | ISLR Ch2; Devore Ch6 |
 | 10/2 | Exam 1 | | | HW5 (due 10/16)
12 | 10/4 | Probability theory review | | ISLR Ch2; Devore Ch6 |
 | 10/9 | Fall Break | | |
13 | 10/11 | Curse of dimensionality; Model accuracy; Bias–variance trade-off | HT Ch2 | ISLR Ch2 | HW6 (due 10/25)
14 | 10/16 | Confidence interval | HT Ch3 | ISLR 3.1; Devore 7.1, 8.1, 8.3 |
15 | 10/18 | Hypothesis test (in class); Multiple regression, F-statistic; Qualitative predictors, interaction (in HW) | HT Ch3 | ISLR 3.2, ESL 3.2; ISLR 3.3.1–2 | HW7 (due 11/1)
Classification | | | | |
16 | 10/23 | Hypothesis test (cont'd); Logistic regression | HT Ch4 | ISLR 4.1–3, ESL 4.4 |
17 | 10/25 | Logistic regression (cont'd); MLE | | Devore 6.2 | Project
18 | 10/30 | MLE (cont'd) & invariance principle; Link function for GLM | HT Ch4 | McCulloch 5.1–4 | HW8 (due 11/13)
19 | 11/1 | Linear discriminant analysis | HT Ch4 | ISLR 4.4, Leon 6.3.1, 6.4; ISLR 4.4.3, Murphy 5.7.2.1 |
20 | 11/6 | Error types, ROC, AUC, EER; Naive Bayes; Logistic vs. LDA | | ESL 6.6.3, Murphy 3.5; ISLR 4.5; Devore 6.2 | HW9 (due 11/15, extended to 11/21)
Other Topics | | | | |
21 | 11/8 | Cross-validation | HT Ch5 | ISLR 5.1 |
22 | 11/13 | Cross-validation (cont'd); Bootstrap | HT Ch5 | ISLR 5.1, 5.2 |
23 | 11/15 | Bootstrap (cont'd); Regularization | HT Ch5, Ch6 | ISLR 5.2, 6.2 |
 | 11/20 | Exam 2 | | |
 | 11/22 | Thanksgiving Holiday | | |
24 | 11/27 | Regularization (cont'd) | HT Ch6 | ISLR 6.2 |
25 | 11/29 | Support vector machine (SVM) | HT Ch9 | ISLR Ch9 |
26 | 12/4 | Diffusion models; Unsupervised learning: Clustering, PCA, Topic model, HMM | Slides; HT Ch10 | DDPM, SDXL, DALL·E 3; ISLR 10.2, 10.3; Topic model 1, 2 |