Vidya Muthukumar

ECE 8803: Online Decision Making in Machine Learning (Fall 2021)

Times: Monday and Wednesday, 9:30 – 10:45 am

Location: Scheller College of Business, room 223

Instructor: Vidya K Muthukumar

Office Hours: 12-1 pm, virtual

Prerequisites: undergraduate probability (ECE3077 or equivalent), undergraduate linear algebra (MATH 2551 or equivalent). Mathematical maturity and familiarity with proof-based arguments will be assumed.

 

Brief description: In many applications of machine learning (ML), data is collected sequentially, and decisions can affect performance both now and in the future. This class covers the design of ML algorithms for real-time decision making, including reinforcement learning. Both classical applications in engineering and modern applications in the ML pipeline will be discussed, but the focus of the course is foundational: understanding the design principles and inner workings of algorithms for online decision making.

Upon successful completion of this course, students will be able to:

  • Understand and explain the basic design principles of any online algorithm under diverse assumptions on the environment and reward feedback mechanism.
  • Understand how these principles relate to classical concepts in information theory, signal processing, communications, and control theory.
  • Assess the efficacy of an online algorithm for an engineering/machine learning application based on its performance guarantees, tractability of implementation, scalability and assumptions made on the environment.
  • Appreciate how online algorithms relate to other aspects of the machine learning pipeline.

Grading/Format: The course will be graded as follows:

  • Homeworks (top 5/6): 45%
  • Midterm (take-home, Oct 21-22): 25%
  • Course project: 30%

 

Piazza/Canvas: The primary mode of interactive communication in this course will be Piazza. Please sign up at the course page, and monitor Piazza for announcements regarding lectures, homeworks, the midterm, and the project. As is standard, we will also use Canvas to keep track of assignments and share resources related to the class.

 

Resources and schedule

Lecture schedule:

Date | Topic | Resources
23 Aug | Logistics and introduction | Slides
25 Aug | Discussion on probability/linear algebra | Review note on probability; Review note on linear algebra
30 Aug | Basics of prediction of an adversarial sequence | Lecture note
1 Sep | The multiplicative weights algorithm and “no regret” | Lecture note
8 Sep | No-regret through perturbation | Lecture note (also for 13 Sep)
13 Sep | No-regret through perturbation, continued | See 8 Sep
15 Sep | From prediction to decision-making: Online linear optimization | Lecture note
20 Sep | Follow-the-Regularized-Leader, Introduction to online convex optimization | Lecture note
22 Sep | Online convex optimization and stochastic optimization | Lecture note
27 Sep | Overview of adaptive methods in online learning | Lecture note
29 Sep | Introduction to limited-information feedback | Lecture note
4 Oct | Limited-information feedback and UCB, Part 1 | Lecture note
6 Oct | Limited-information feedback and UCB, Part 2 | Lecture note
13 Oct | UCB and informal discussion of lower bound | Lecture note
18 Oct | Thompson sampling algorithm, Part 1 | Lecture note
20 Oct | Thompson sampling algorithm, Part 2 | Lecture note
25 Oct | Structured bandits: Linear and Gaussian processes | Lecture note
27 Oct | Contextual bandits |
1 Nov | Dynamic programming and optimal control | Lecture note
3 Nov | Tabular RL with a generative model | Lecture note
8 Nov | Model-based exploration in tabular RL | Lecture note
10 Nov | Value iteration and Q-learning | Lecture note
15 Nov (virtual) | Policy-based methods | Lecture note
17 Nov (virtual) | Optional general-audience video-listening |
22 Nov | An overview of RL with function approximation | Lecture note
29 Nov | Online learning and zero-sum game theory | Lecture note
1 Dec | Online learning and non-zero-sum game theory | Lecture note
6 Dec | LAST DAY OF CLASS: Poster presentations | N/A

Homework schedule:

Submissions and self-grade uploads are both due at 11:59 ET on the dates listed below. Both are made via Canvas.

Homework | Rough set of topics | Upload date | Due date | Self-grade due date
Homework 1 | Fundamentals of adversarial prediction | 1 Sep | 14 Sep | 21 Sep
Homework 2 | From prediction to decision-making | 15 Sep | 28 Sep | 5 Oct
Homework 3 | Online optimization, introduction to bandits | 1 Oct | 18 Oct | 24 Oct
Homework 4 | Bayesian and structured bandits | 27 Oct | 10 Nov | 17 Nov
Homework 5 | Reinforcement learning and optimal control | 11 Nov | 24 Nov | 6 Dec

 

 

 
