
Vidya Muthukumar

ECE 8803: Online Decision Making in Machine Learning (Fall 2022)

Times: Tuesday and Thursday, 2 – 3:15 pm

Location: Whitaker building, room 1103

Instructor: Vidya K Muthukumar

Office Hours: Tuesdays 3:30-4:30 pm (tentative), virtual

Prerequisites: undergraduate probability (ECE 3077 or equivalent), undergraduate linear algebra (MATH 2551 or equivalent). Mathematical maturity and familiarity with proof-based arguments will be assumed.

Brief description: In many applications of machine learning (ML), data is collected sequentially; moreover, decisions can impact performance both in the present and the future. This class will deal with the design of ML algorithms for real-time decision making, including reinforcement learning. Classical applications in engineering and modern applications in the ML pipeline will both be discussed, but the focus of the course will be foundational: understanding the design principles and inner workings of algorithms for online decision-making.

Upon successful completion of this course, students will be able to:

  • Understand and explain the basic design principles of any online algorithm under diverse assumptions on the environment and reward feedback mechanism.
  • Understand how these principles relate to classical concepts in information theory, signal processing, communications, and control theory.
  • Assess the efficacy of an online algorithm for an engineering/machine learning application based on its performance guarantees, tractability of implementation, scalability and assumptions made on the environment.
  • Appreciate how online algorithms relate to other aspects of the machine learning pipeline.

Grading/Format: The course will be graded as follows:

  • Homeworks (top 4/5): 45%
  • Midterm (take-home, tentative date Oct 20-21): 25%
  • Course project: 30%

Piazza/Canvas: The primary mode of interactive communication in this course will be Piazza. Please sign up at the course page, and monitor Piazza for announcements regarding lecture, homeworks, midterm and project. As is standard, we will also use Canvas to keep track of assignments and share resources related to the class.

Resources and schedule

Lecture schedule:

Date | Topic | Resources
23 Aug | Logistics and introduction |
25 Aug | Review session on probability and basics of ML | Probability review; Basics of ML review
30 Aug | Basics of prediction of an adversarial sequence | Lecture note
1 Sep | The multiplicative weights algorithm | Lecture note
6 Sep | The multiplicative weights algorithm and decision-making using expert advice | Lecture note
8 Sep | No-regret through perturbation | Lecture note
13 and 15 Sep | No-regret through perturbation, continued | Lecture note
20 Sep | Online linear optimization |
22 Sep | Online convex optimization and stochastic optimization |
27 Sep | Overview of adaptive methods in online learning |
29 Sep | Introduction to limited-information feedback |
4 Oct | Limited-information feedback and UCB |
6 Oct | Limited-information feedback and UCB, continued |
11 Oct | Wrapping up UCB; informal discussion of lower bound |
13 Oct | Thompson sampling algorithm, Part 1 |
20 Oct (asynchronous due to midterm) | Thompson sampling algorithm, Part 2 |
25 Oct | Structured bandits: Linear and Gaussian processes |
27 Oct | Contextual bandits |
1 Nov | Dynamic programming and optimal control |
3 Nov | Tabular RL with a generative model |
8 Nov | Model-based exploration in tabular RL |
10 Nov | Value iteration and Q-learning |
15 Nov | Policy-based methods |
17 Nov | An overview of RL theory with function approximation |
22 Nov | Bonus lecture: RL and function approximation in practice |
29 Nov (asynchronous due to instructor travel) | Online learning and zero-sum game theory |
1 Dec (asynchronous due to instructor travel) | Online learning and non-zero-sum game theory |
6 Dec | LAST DAY OF CLASS: Poster presentations |

Homework schedule:

Homework submissions and self-grade uploads are both due at 11:59 ET on their respective due dates. Both will be done via Canvas.

Homework | Rough set of topics | Upload date | Due date | Self-grade due date
Homework 0 (optional) | Review of probability and linear algebra | 23 Aug | 30 Aug | N/A
Homework 1 | Fundamentals of adversarial prediction | 1 Sep | 14 Sep | 21 Sep
