Time & Location: Lecture: MW 4:30–5:45 pm, 1229 EB2. Discussion: F 12:50–1:40 pm, 1229 EB2.
Instructor: Dr. Chau-Wai Wong
Teaching Assistants: Ms. Chanae Ottley
Scheduled Office Hours (Zoom link can be found here. If none of the time slots work for you, you may request an office hour session via a Piazza private message.)
Day | Time | Place | Person |
---|---|---|---|
Mondays | 11 am-12 noon | Open space outside 2116 EB2 | Ottley |
Mon & Wed | After lectures | Classroom or office | Wong |
Thursdays | 9:30-10 am | Zoom | Wong |
Sundays | 2-2:30 pm | Zoom | Wong |
Course Forum: Piazza
Homework Submission: Gradescope
Course Description: Deep learning has progressed remarkably over the past decade, and virtually every industry has seen the promising potential of its applications. This course introduces fundamental concepts and algorithms in machine learning that are vital for understanding state-of-the-art developments in deep learning. Students are exposed to real-world applications through well-guided homework programming problems and a term project.
Prerequisites: (i) ST 300-level or above, and (ii) ECE301/CSC316/ISE361/MA341. Email the instructor your transcript to discuss whether the second prerequisite can be waived.
Course Structure: The course consists of two 75-min lectures and one 50-min discussion section per week. A teaching assistant will lead the discussion section, covering practice problems and answering questions from students. There will be weekly homework assignments (30%) that contain both written and programming problems, two midterm exams (20%×2), and one term project (30%). Programming will be in Python, R, or Matlab. Students are expected to be able to write computer programs and to have mathematical maturity in probability theory (e.g., have taken a 300-level statistics course) before taking the course. A linear algebra course such as MA305/405 is recommended while taking the course.
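As an illustration of how the stated weights combine, here is a minimal Python sketch. It is not the official grading procedure: it assumes each component is scored on a 0–100 scale and that the overall score is a plain weighted average. The names `WEIGHTS` and `overall_score` and the sample scores are hypothetical.

```python
# Weights from the course structure: homework 30%, two midterm exams 20% each,
# term project 30%. How component scores are aggregated is an assumption here.
WEIGHTS = {"homework": 0.30, "exam1": 0.20, "exam2": 0.20, "project": 0.30}


def overall_score(scores):
    """Return the weighted average of component scores (each assumed 0-100)."""
    missing = set(WEIGHTS) - set(scores)
    if missing:
        raise ValueError(f"missing components: {sorted(missing)}")
    return sum(WEIGHTS[name] * scores[name] for name in WEIGHTS)


# Hypothetical scores, for illustration only.
example = {"homework": 90.0, "exam1": 80.0, "exam2": 70.0, "project": 85.0}
print(round(overall_score(example), 2))
```

Because the four weights sum to 100%, a student with the same score on every component would receive that score overall.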
Textbooks:
Topics:
Classical topics: regression, classification, support vector machines, and cross-validation.
Cutting-edge topics: convolutional neural networks (CNN), long short-term memory (LSTM), transformers (e.g., BERT and GPT), and diffusion models.
Class # | Date | Topic | Lecture notes | Readings | HW Assignment |
---|---|---|---|---|---|
1 | 8/19 | Introduction | Slide deck 1 | ISLR Ch1–2; ML Supp | HW1 (due 8/26) |
2 | 8/21 | Machine learning overview | | ISLR Ch1–2 | |
3 | 8/26 | Supervised learning | | ISLR Ch1–2 | HW2 (due 9/9) |
4 | 8/28 | Linear regression, Matrix-vector form; Least squares; Vector space | | Scheffe Ch1; Scheffe App 1 | |
Deep Learning | | | | | |
| | 9/2 | Labor Day | | | |
5 | 9/4 | Geometric interpretation; Modern ML applications - CNN | Slide deck 2 | DL Ch6, Ch9 | HW3 (due 9/19) |
6 | 9/9 | Neural network training: Backpropagation | | DL Ch8, Ch11 | |
7 | 9/11 | Neural network training: Backpropagation (cont'd) | | | HW4 (due 9/25) |
8 | 9/16 | Modern ML applications - LSTM, Transformers (BERT & GPT) | | DL Ch10 | |
9 | 9/18 | Modern ML applications - Transformers (BERT & GPT) (cont'd) | | | |
10 | 9/23 | Modern ML applications - Diffusion models | | DDPM, SDXL, DALL·E 3 | |
Linear Statistical Models: Regression | | | | | |
11 | 9/25 | Regression function, Conditional expectation | HT Ch2 | ISLR Ch2; Devore 3.3, 4.2, 2.4; Leon 3.2, 5.7 | |
| | 9/30 | Exam 1 | | | |
12 | 10/2 | Probability theory review, Regression function (cont'd) | | ISLR Ch2; Devore Ch6 | |
13 | 10/7 | Curse of dimensionality, Model accuracy, Bias-variance trade-off | HT Ch2 | ISLR Ch2 | HW5 (due 10/21) |
Classification | | | | | |
14 | 10/9 | Logistic regression | HT Ch4 | ISLR 4.1–3, ESL 4.4 | |
| | 10/14 | Fall Break | | | |
15 | 10/16 | MLE & Invariance principle; Link function for GLM | | Devore 6.2; McCulloch 5.1–4 | HW6 (due 10/30); Project |
16 | 10/21 | Linear discriminant analysis | | ISLR 4.4, Leon 6.3.1, 6.4; ISLR 4.4.3, Murphy 5.7.2.1 | |
17 | 10/23 | Error types, ROC, AUC, EER | | ESL 6.6.3, Murphy 3.5; ISLR 4.5; Devore 6.2 | HW7 (due 11/6) |
Other Topics | | | | | |
18 | 10/28 | Naive Bayes, Logistic vs. LDA; Cross-validation | HT Ch5 | ISLR 5.1 | Lab |
19 | 10/30 | Talks & panel; Cross-validation (cont'd) | HT Ch5 | ISLR 5.1 | HW8 (due 11/21) |
20 | 11/4 | Cross-validation (cont'd); Bootstrap | HT Ch5 | ISLR 5.2 | |
21 | 11/6 | Regularization | HT Ch6 | ISLR 6.2 | |
22 | 11/11 | Support vector machine (SVM) | HT Ch9 | ISLR Ch9 | |
23 | 11/13 | Unsupervised learning: Clustering; PCA | HT Ch10 | ISLR 10.2, 10.3 | |
| | 11/18 | Exam 2 | | | |
24 | 11/20 | LLM "secret sauce"; Invited ML career talk (last 15 mins) | | | |
Classification (cont'd) | | | | | |
25 | 11/25 | Confidence interval, Hypothesis test | HT Ch3 | ISLR 3.1, Devore 7.1, 8.1, 8.3 | |
| | 11/27 | Thanksgiving Holiday | | | |
26 | 12/2 | Hypothesis test (cont'd); Optional: Multiple regression, F-statistic; Qualitative predictors, Interaction | HT Ch3 | ISLR 3.2, ESL 3.2; ISLR 3.3.1–2 | |