Difference between revisions of "10-601 Markov Decision Processes and Reinforcement Learning"

Latest revision as of 10:24, 1 December 2013

What distinguishes RL from Supervised Learning and Unsupervised Learning?
Elements of a Markov Decision Process
Both value iteration and policy iteration are standard algorithms for solving MDPs, and there is currently no universal agreement on which of the two is better.
Q-learning is model-free and exploits the temporal difference between successive value estimates.
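
As a minimal sketch of the points above, the following Python compares value iteration (which reads the model directly) with tabular Q-learning (which only samples transitions) on a toy problem. Everything here is an assumption made for illustration and is not taken from the lecture: the four-state deterministic chain MDP, the reward of 1 for entering the last state, the names `step`, `value_iteration` and `q_learning`, and the values chosen for `GAMMA`, `alpha`, `epsilon` and `episodes`.

```python
import random

# Hypothetical toy MDP (an assumption for illustration): a deterministic
# chain of states 0..3, where state 3 is terminal.
N_STATES = 4
TERMINAL = 3
ACTIONS = (0, 1)   # 0 = move left, 1 = move right
GAMMA = 0.9        # discount factor


def step(state, action):
    """Deterministic transition: return (next_state, reward)."""
    nxt = max(0, state - 1) if action == 0 else state + 1
    reward = 1.0 if nxt == TERMINAL else 0.0
    return nxt, reward


def value_iteration(tol=1e-9):
    """Model-based: repeatedly apply the Bellman optimality backup."""
    V = [0.0] * N_STATES          # the terminal state keeps value 0
    while True:
        delta = 0.0
        for s in range(TERMINAL):
            outcomes = [step(s, a) for a in ACTIONS]
            best = max(r + GAMMA * V[n] for n, r in outcomes)
            delta = max(delta, abs(best - V[s]))
            V[s] = best
        if delta < tol:
            return V


def q_learning(episodes=2000, alpha=0.5, epsilon=0.2, seed=0):
    """Model-free: learn Q from sampled transitions with TD updates."""
    rng = random.Random(seed)
    Q = [[0.0, 0.0] for _ in range(N_STATES)]
    for _ in range(episodes):
        s = 0
        while s != TERMINAL:
            # Epsilon-greedy action selection.
            if rng.random() < epsilon:
                a = rng.choice(ACTIONS)
            else:
                a = max(ACTIONS, key=lambda b: Q[s][b])
            n, r = step(s, a)
            # Temporal-difference error: bootstrapped target minus estimate.
            td_error = r + GAMMA * max(Q[n]) - Q[s][a]
            Q[s][a] += alpha * td_error
            s = n
    return Q
```

Calling `value_iteration()` and `q_learning()` and comparing `max(Q[s])` with `V[s]` for each non-terminal state is one way to check that, on this small deterministic example, the learned Q-values approach the optimal state values even though `q_learning` uses `step` only as a black-box sampler.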

@@ Line 2: / Line 2: @@
 === Slides ===
-* [http://curtis.ml.cmu.edu/w/courses/images/d/dc/Lecture22-SN.pdf Slides in PDF]
+* [http://curtis.ml.cmu.edu/w/courses/images/1/1b/Lecture24-RL.pdf Slides in PDF]
-* Also: [http://www.cs.cmu.edu/~wcohen/10-601/AWS.pptx Slides about Amazon Web Services]
 === Readings ===