Instructor/Organizer: Huizhen
Janey Yu (janey.yu@cs.helsinki.fi)
Time and Location: Tue 16-18, Exactum C221
This weekly seminar is about analytical and computational methods for discrete-time Markov decision processes (MDP), a general class of models for sequential decision making under uncertainty. We will study the fundamentals of MDPs as well as the more recent research works, which include approximation methods for large scale problems, simulation-based reinforcement learning methods, and partially observable Markov decision processes (POMDP). Our focus will be on MDPs with finite space models and the expected cost criteria.
The intention of the seminar in general is to introduce the audience to the stochastic dynamic programming methodology, to the extent that the audience will be able to analyze, apply, or develop such methods for their own area of interest.
A tentative syllabus is as follows:
The development of the theory of MDPs is fairly linear: new analyses and methods build upon ones preceding them. For this reason, we will have lectures covering some of the fundamentals during the first 1/3 of the seminar, meanwhile the participants will have time to absorb and practice the ideas, and to decide which topic they would like to investigate for their own presentations. Later in the seminar, we will also have introductory lectures whenever we start a different topic.
The requirements for the participants are a term paper and a presentation. For presentation, each participant can either prepare a lecture (possibly jointly with another participant), or prepare a presentation on articles. For term paper, participants are encouraged to apply the methods to a problem of their own choice; the contents of the paper can include modeling issues, analysis, or computational results, and should demonstrate understanding of the subject of the seminar.
We will have lecture notes in addition to the references.
Reference books:
[1] D. P. Bertsekas. Dynamic Programming and Optimal Control, Athena Scientific, 2005. (Slides on some materials of the book are available online.)
[2] M. L. Puterman. Markov Decision Processes: Discrete Stochastic Dynamic Programming, Wiley series in probability and mathematical statistics, John Wiley & Sons, Inc., 1994.
Articles:
A tentative selection of articles for presentations >>
05/09 Lecture 1: Introduction
12/09 Prof. D. P. Bertsekas: Guest lecture on Neuro-Dynamic Programming
19,26/09 Lecture 2: Finite-horizon problems
Student presentation on the Sequential Assignment Problem
03/10 Lecture 3: Infinite-horizon discounted problems (I)
10/10 Lecture 4: Infinite-horizon discounted problems (II)
31/10 Lecture 5: Markov chains and classification of MDP based on chain structure
07/11 Petteri Nurmi: Guest lecture on Game Theory and Reinforcement Learning with Applications to Routing in Ad Hoc Networks
21/11 Prof. Jyrki Kivinen: Guest lecture on Combining Expert Advice
Lecture 6: Selected Topics on Average Cost Problems (I)
28/11 Student presentation on Robust Markov Decision Processes with Uncertain Transition Matrices
Student presentation on Semi-Markov Decision Processes
30/11 Student presentation on Inverse Reinforcement Learning
Student presentation on the SEARN Algorithm
07/12 Student Presentation on Linear Programming in Discounted Markov Decision Problems
Student presentation on the Linear Programming Approach to Approximate Dynamic Programming
Student presentation on Review of Reinforcement Learning Methods
Updated in December, 2006.