ECE 8803: Online Decision Making in Machine Learning (Spring 2025)

Times: Tuesday and Thursday, 3:30-4:45 pm

Location: MoSE building, room 1224

Instructor: Vidya K Muthukumar (vmuthukumar8@gatech.edu)

Office Hours: Tuesday and Thursday 4:45-5:30 pm (after class), location TBD

Prerequisites: undergraduate probability (ECE 3077 or equivalent), undergraduate linear algebra (MATH 2551 or equivalent). Mathematical maturity and familiarity with proof-based arguments will be assumed.

Brief description: In many applications of machine learning (ML), data is collected sequentially, and decisions affect performance both in the present and in the future. This class covers the design of ML algorithms for real-time decision making, including reinforcement learning. Both classical applications in engineering and modern applications in the ML pipeline will be discussed, but the focus of the course is foundational: understanding the design principles and inner workings of algorithms for online decision making.

Upon successful completion of this course, students will be able to:

Understand and explain the basic design principles of any online algorithm under diverse assumptions on the environment and reward feedback mechanism.
Understand how these principles relate to classical concepts in information theory, signal processing, communications, and control theory.
Assess the efficacy of an online algorithm for an engineering/machine learning application based on its performance guarantees, tractability of implementation, scalability, and the assumptions it makes about the environment.
Appreciate how online algorithms relate to other aspects of the machine learning pipeline.

Grading/Format: The course will be graded as follows:

Homework (best 4 of 5): 45%
Midterm (take-home, tentative dates March 13-14): 25%
Course project: 30%
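
As a rough illustration of how the weights above combine, here is a minimal Python sketch. It assumes that every component is scored out of 100, that there are five graded homeworks, and that the best four are averaged with equal weight; the syllabus does not state how the homework scores are combined, so treat that part as an assumption. The function name `course_grade` and the example scores are hypothetical.

```python
def course_grade(homework_scores, midterm, project):
    """Combine component scores (each on a 0-100 scale) into a course percentage.

    Assumption (not stated in the syllabus): the best four homework
    scores are averaged with equal weight.
    """
    if len(homework_scores) < 4:
        raise ValueError("need at least four homework scores")
    # Keep the four highest homework scores and drop the rest.
    best_four = sorted(homework_scores, reverse=True)[:4]
    homework_avg = sum(best_four) / 4
    # Weights from the grading breakdown: 45% / 25% / 30%.
    return 0.45 * homework_avg + 0.25 * midterm + 0.30 * project


# Hypothetical scores, for illustration only.
example = course_grade([90, 80, 100, 70, 60], midterm=80, project=90)
print(round(example, 2))
```

In this example the lowest homework score (60) is dropped before averaging, so it has no effect on the result.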

Piazza/Canvas: The primary mode of interactive communication in this course will be Piazza. Please sign up at the course page, and monitor Piazza for announcements regarding lectures, homework, the midterm, and the project. As is standard, we will also use Canvas to keep track of assignments and share resources related to the class.

Resources and schedule

Lecture schedule (tentative, subject to change)

Date	Topic	Resources
7 Jan	Logistics and introduction
9 Jan	Review session on probability and basics of ML	Probability review notes; ML review notes
14 Jan	Basics of prediction of an adversarial sequence	Lecture note
16 Jan	The multiplicative weights algorithm	Lecture note
21 Jan	Decision-making using expert advice; application to linear programs	Lecture note
23 Jan	No-regret through perturbation	Lecture note
28 Jan	No-regret through perturbation	Lecture note
30 Jan	No-regret through perturbation, introduction to online linear optimization	Lecture note
4 Feb	Online linear optimization	Lecture note
6 Feb	Online convex optimization and stochastic optimization	Lecture note
11 Feb	Overview of adaptive methods in online learning	Lecture note
13 Feb	Online learning and zero-sum game theory	Lecture note; Extra note
18 Feb	Introduction to limited-information feedback	Lecture note
20 Feb	Limited-information feedback and UCB	Lecture note
25 Feb	Wrapping up UCB; lower bounds	Lecture note
27 Feb	Thompson sampling algorithm	Lecture note; Extra note
4 Mar	No class (instructor conflict)
6 Mar	Structured bandits: Linear and Gaussian processes	Lecture note
11 Mar	Contextual bandits and adversarial bandits	Lecture note
13-14 Mar	Take-home midterm
17-21 Mar	No class (spring break)
25 Mar	Dynamic programming and optimal control	Lecture note
27 Mar	Tabular RL with a generative model	Lecture note
1 Apr	Model-based exploration in tabular RL	Lecture note
3 Apr	Value iteration and Q-learning	Lecture note
8 Apr	Policy-based methods	Lecture note
10 Apr	RL with function approximation, theory	Lecture note
15 Apr	RL with function approximation, practice
17 Apr	No class (instructor travel)
22 Apr	Last day of class: poster presentations	Location: Coda 9th floor atrium

Homework schedule:

Homework submissions and self-grades are both due at 11:59 ET on the dates listed below, and both are uploaded via Canvas.

Homework	Topics (approximate)	Upload date	Due date	Self-grade due date
Homework 0 (optional)	Review of probability and linear algebra	7 Jan	14 Jan	N/A
Homework 1	Basics of online prediction	17 Jan	4 Feb	17 Feb
Homework 2	Online optimization	6 Feb	21 Feb	28 Feb