10-601 Markov Decision Processes and Reinforcement Learning

From Cohen Courses
Revision as of 09:24, 1 December 2013 by Pxie1

Slides

Readings

Reinforcement Learning

Take-home message

  • What distinguishes RL from Supervised Learning and Unsupervised Learning?
  • Elements of a Markov Decision Process
  • Both value iteration and policy iteration are standard algorithms for solving MDPs, and there isn't currently universal agreement over which algorithm is better.
  • Q-learning is model-free and exploits the temporal difference between successive value estimates
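
To make the last two points concrete, here is a minimal sketch contrasting value iteration (which uses the model) with tabular Q-learning (which only samples transitions). Everything in it is an assumption made for illustration and is not taken from the lecture: the 4-state chain MDP, the helper names step, value_iteration and q_learning, and the settings GAMMA, alpha and epsilon.

```python
import random

# Hypothetical toy MDP (an assumption for illustration): a chain of states 0..3.
# Action 0 moves left, action 1 moves right; entering state 3 gives reward 1
# and ends the episode. Transitions are deterministic.
N_STATES = 4
ACTIONS = (0, 1)
TERMINAL = 3
GAMMA = 0.9


def step(state, action):
    """Return (next_state, reward) for the toy chain."""
    if action == 0:
        nxt = max(0, state - 1)
    else:
        nxt = min(N_STATES - 1, state + 1)
    reward = 1.0 if nxt == TERMINAL else 0.0
    return nxt, reward


def value_iteration(tol=1e-8):
    """Model-based: sweep the Bellman optimality backup until values settle."""
    V = [0.0] * N_STATES
    while True:
        delta = 0.0
        for s in range(N_STATES):
            if s == TERMINAL:
                continue  # terminal value stays 0
            backups = []
            for a in ACTIONS:
                s2, r = step(s, a)  # the planner queries the model directly
                backups.append(r + GAMMA * V[s2])
            best = max(backups)
            delta = max(delta, abs(best - V[s]))
            V[s] = best
        if delta < tol:
            return V


def q_learning(episodes=2000, alpha=0.5, epsilon=0.2, seed=0):
    """Model-free: learn Q from sampled transitions via temporal-difference updates."""
    rng = random.Random(seed)
    Q = [[0.0 for _ in ACTIONS] for _ in range(N_STATES)]
    for _ in range(episodes):
        s = 0
        while s != TERMINAL:
            if rng.random() < epsilon:
                a = rng.choice(ACTIONS)  # explore
            else:
                best = max(Q[s])  # exploit, breaking ties at random
                a = rng.choice([act for act in ACTIONS if Q[s][act] == best])
            s2, r = step(s, a)  # the learner only observes the outcome
            td_error = r + GAMMA * max(Q[s2]) - Q[s][a]
            Q[s][a] += alpha * td_error
            s = s2
    return Q


if __name__ == "__main__":
    print("V from value iteration:", value_iteration())
    print("Q from Q-learning:", q_learning())
```

On this toy problem both methods should agree that moving right is optimal in every non-terminal state, and max over actions of Q should come out close to V. The difference is in what each method needs: value_iteration loops over all states and calls the model for every action, while q_learning only ever sees the transitions it happens to experience.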