Times: Tuesday and Thursday, 9:30-10:45 am
Location: ISyE Instructional Center 115
Instructor: Vidya K Muthukumar (vmuthukumar8@gatech.edu)
UTAs: Sarah Friedrichs (sfriedrichs3@gatech.edu), Jack Ganem (jganem6@gatech.edu)
Office Hours: Instructor OH Tuesdays 11 am-12 pm, Groseclose 336 (virtual option also available)
Homework discussion OH Thursdays 12 pm-1 pm, venue TBD (somewhere in Groseclose)
Prerequisites: ISYE 3133 (Optimization) and ISYE 2027 (Probability). Mathematical maturity and comfort with abstract mathematical notation will be assumed. A few elementary proof-based arguments will also be covered.
Brief description: At the heart of most machine learning applications today (such as advertisement placement, movie recommendation, and node prediction in evolving networks) is an optimization engine that tries to make the best decision given the information observed so far. This is the problem of online learning. Solving it requires making real-time decisions and continually improving performance as new data and feedback from previous decisions arrive. The course aims to provide a foundation for developing and analyzing such online methods. We will discuss fundamental principles for learning in an unknown environment, learning from limited feedback, and learning when decisions have dynamic, long-term consequences.
Upon successful completion of this course, students will be able to:
- Identify real-world scenarios in which online learning is applicable.
- Develop algorithms that combine partial information as effectively as possible to make online decisions.
- Understand how exploration of the decision space and exploitation of historical data must be balanced to reach optimal decisions.
- Understand the dynamic-programming principle of sequential decision-making when decisions have long-term consequences, and appreciate principles for learning in such long-term environments.
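As a concrete illustration of the exploration/exploitation outcome above, here is a minimal sketch of an epsilon-greedy strategy for a bandit problem, one of the topics on the lecture schedule. It is not course material: the function name `epsilon_greedy`, the Bernoulli reward model, and the default `epsilon=0.1` are assumptions made for this example only.

```python
import random


def epsilon_greedy(arm_means, horizon, epsilon=0.1, seed=0):
    """Play a simulated Bernoulli bandit for `horizon` rounds.

    `arm_means` holds the (hypothetical) success probability of each arm.
    Returns the number of pulls per arm and the total reward collected.
    """
    rng = random.Random(seed)
    n_arms = len(arm_means)
    pulls = [0] * n_arms          # how often each arm has been played
    estimates = [0.0] * n_arms    # running average reward of each arm
    total_reward = 0.0

    for t in range(horizon):
        if t < n_arms:
            arm = t  # play every arm once so each estimate is initialized
        elif rng.random() < epsilon:
            arm = rng.randrange(n_arms)  # explore: pick an arm uniformly at random
        else:
            arm = max(range(n_arms), key=lambda a: estimates[a])  # exploit the best estimate

        # Simulated feedback: reward 1 with probability arm_means[arm], else 0.
        reward = 1.0 if rng.random() < arm_means[arm] else 0.0

        pulls[arm] += 1
        estimates[arm] += (reward - estimates[arm]) / pulls[arm]  # incremental mean update
        total_reward += reward

    return pulls, total_reward


if __name__ == "__main__":
    pulls, total = epsilon_greedy([0.2, 0.5, 0.8], horizon=2000)
    print("pulls per arm:", pulls)
    print("total reward:", total)
```

Only the reward of the chosen arm is observed in each round, which is what makes occasional exploration necessary: without it, an arm with an unlucky early estimate might never be tried again.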
Grading/Format: The course will be graded as follows:
- Assignment 0: 2% bonus (released Aug 20, due Aug 27, covers course prerequisites)
- Homeworks (4 in total): 50%
- Two in-class midterms (tentative dates: Sep 24 and Oct 24): 30% in total
- Final exam (Dec 9, 8-10:50 am): 20%
Lecture medium and study resources: All lectures will be in person and on the board (projected from a tablet). I will record lectures and make the recordings and accompanying handwritten notes available a few hours after each lecture. These recordings are intended to accompany in-person attendance, not to substitute for it.
In addition, I will provide supplemental typewritten notes.
Piazza/Canvas: The primary mode of interactive communication in this course will be Piazza. Please sign up at the course page, and monitor Piazza for announcements regarding lectures, homework, and exams. As is standard, we will also use Canvas to keep track of assignments and share resources related to the class.
Resources and schedule
Lecture schedule (tentative, subject to change)
Date | Topic | Additional resources (available internally on Canvas/Piazza) |
20 Aug | Logistics and introduction | Lecture recording and slides |
22 Aug | Review session on prerequisites | |
27 Aug | Introduction to prediction of an adversarial binary sequence and the halving algorithm | |
29 Aug | Binary sequence prediction: The halving algorithm and the weighted majority algorithm | |
3 Sep | Binary sequence prediction: The randomized weighted majority algorithm | |
5 Sep | Prediction with expert advice: General loss functions | |
10 Sep | Prediction with expert advice: General loss functions (continued) | |
12 Sep | Application: solving linear programs | |
17 Sep | Application: solving linear programs (continued) | |
19 Sep | Review: Online sequence prediction | |
24 Sep | In-class midterm 1 | |
26 Sep | Midterm 1 discussion | |
1 Oct | Limited-information feedback (bandits): Introduction and heuristics | |
3 Oct | Bandits, pure-greedy and epsilon-greedy algorithms | |
8 Oct | Bandits, UCB | |
10 Oct | Bandits, UCB (continued) | |
15 Oct | NO CLASS: Fall Break | |
17 Oct | Bandits, Thompson sampling (virtual lecture due to instructor travel) | |
22 Oct | Review: Bandits | |
24 Oct | In-class midterm 2 | |
29 Oct | Midterm 2 discussion | |
31 Oct | Bandits, Thompson sampling (continued) | |
5 Nov | Bandits, recommender systems | |
7 Nov | Bandits, recommender systems (continued) | |
12 Nov | Dynamic programming and optimal control | |
14 Nov | Dynamic programming and optimal control (continued) | |
19 Nov | A bird's-eye view of reinforcement learning | |
21 Nov | Fairness and ethical considerations in decision making | |
26 Nov | Fairness and ethical considerations in decision making (continued) | |
28 Nov | NO CLASS: Thanksgiving | |
3 Dec | LAST DAY OF CLASS: Review of the entire semester | |
9 Dec | Final exam (8-10:50 am) | |
Assignment schedule (tentative, subject to change)
Assignment | Release date | Due date | Scope of homework |
0 | 20 Aug | 27 Aug | Prerequisites |
1 | 29 Aug | 19 Sep | Binary and general loss sequence prediction |
2 | 19 Sep | 11 Oct | General loss sequence prediction and LPs |
3 | 15 Oct | 5 Nov | Limited-information feedback (bandit) algorithms |
4 | 7 Nov | 3 Dec | Bandit algorithms, recommender systems and dynamic programming |