Difference between revisions of "10-601 Markov Decision Processes and Reinforcement Learning"
From Cohen Courses
(Created page with " === Slides === * [http://curtis.ml.cmu.edu/w/courses/images/d/dc/Lecture22-SN.pdf Slides in PDF] * Also: [http://www.cs.cmu.edu/~wcohen/10-601/AWS.pptx Slides about Amazon W...")
Revision as of 10:19, 1 December 2013
Slides
Readings
Take-home message
- What distinguishes RL from Supervised Learning and Unsupervised Learning?
- Elements of a Markov Decision Process
- Value iteration and policy iteration are both standard algorithms for solving MDPs, and there is currently no universal agreement on which one is better.
- Q-learning is model-free and exploits the temporal difference between successive value estimates.
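To make the last two points concrete, here is a minimal Python sketch that contrasts value iteration, which needs the transition model, with tabular Q-learning, which only samples transitions. The three-state chain MDP, the names `P`, `GAMMA`, `TERMINAL`, `q_from_v`, `value_iteration`, and `q_learning`, and all hyperparameters are assumptions made up for illustration; none of them come from the lecture slides.

```python
import random

# Hypothetical 3-state chain MDP (an illustration, not from the lecture).
# P[s][a] is a list of (probability, next_state, reward) triples.
# Action 0 = stay, action 1 = move right; state 2 is terminal.
GAMMA = 0.9
TERMINAL = 2
P = {
    0: {0: [(1.0, 0, 0.0)], 1: [(1.0, 1, 0.0)]},
    1: {0: [(1.0, 1, 0.0)], 1: [(1.0, 2, 1.0)]},
    2: {0: [(1.0, 2, 0.0)], 1: [(1.0, 2, 0.0)]},
}


def q_from_v(V, s, a):
    """Expected one-step return of taking action a in state s, then following V."""
    return sum(p * (r + GAMMA * V[s2]) for p, s2, r in P[s][a])


def value_iteration(tol=1e-10):
    """Model-based: repeatedly apply the Bellman optimality backup using P."""
    V = {s: 0.0 for s in P}
    while True:
        delta = 0.0
        for s in P:
            best = max(q_from_v(V, s, a) for a in P[s])
            delta = max(delta, abs(best - V[s]))
            V[s] = best
        if delta < tol:
            return V


def q_learning(episodes=2000, alpha=0.5, epsilon=0.2, seed=0):
    """Model-free: learn Q from sampled transitions with temporal-difference updates."""
    rng = random.Random(seed)
    Q = {s: {a: 0.0 for a in P[s]} for s in P}
    for _ in range(episodes):
        s = 0
        for _ in range(50):  # cap the episode length
            if s == TERMINAL:
                break
            # Epsilon-greedy action selection.
            if rng.random() < epsilon:
                a = rng.choice(list(P[s]))
            else:
                a = max(Q[s], key=Q[s].get)
            # The learner only observes a sampled (next_state, reward) pair;
            # P stands in for the environment and is never read directly by the update.
            weights = [p for p, _, _ in P[s][a]]
            _, s2, r = rng.choices(P[s][a], weights=weights)[0]
            # Temporal-difference update toward r + gamma * max_a' Q(s', a').
            td_target = r + GAMMA * max(Q[s2].values())
            Q[s][a] += alpha * (td_target - Q[s][a])
            s = s2
    return Q


V = value_iteration()
Q = q_learning()
```

On this toy problem the optimal values are V(0) = 0.9 and V(1) = 1, and the greedy policy read off the learned `Q` table should agree with the one implied by `V`. Policy iteration is omitted here to keep the sketch short.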