10-601 Markov Decision Processes and Reinforcement Learning

Revision as of 10:19, 1 December 2013

What distinguishes RL from Supervised Learning and Unsupervised Learning?
Elements of a Markov Decision Process
Both value iteration and policy iteration are standard algorithms for solving MDPs, and there isn't currently universal agreement over which algorithm is better.
Q-learning is model-free and learns from temporal-difference updates.
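
To make the contrast between the last two points concrete, here is a minimal sketch on a made-up three-state chain. It is an illustration under stated assumptions, not material from the lecture. The names `MODEL`, `GAMMA`, `TERMINAL`, `value_iteration` and `q_learning_update` are hypothetical. Value iteration reads the transition model directly, while the Q-learning update only sees individual transitions (s, a, r, s') and moves Q(s, a) toward the temporal-difference target r + gamma * max_a' Q(s', a').

```python
# Hypothetical toy MDP (an assumption for illustration): states 0, 1 and a
# terminal state 2. Action 0 stays put, action 1 moves right. Entering the
# terminal state pays reward 1.
GAMMA = 0.9
TERMINAL = 2
# MODEL[s][a] = list of (probability, next_state, reward)
MODEL = {
    0: {0: [(1.0, 0, 0.0)], 1: [(1.0, 1, 0.0)]},
    1: {0: [(1.0, 1, 0.0)], 1: [(1.0, 2, 1.0)]},
}


def value_iteration(model, gamma, tol=1e-10):
    """Model-based: repeat the Bellman optimality backup until values settle."""
    V = {s: 0.0 for s in model}
    V[TERMINAL] = 0.0  # the terminal state has no future reward
    while True:
        delta = 0.0
        for s in model:
            best = max(
                sum(p * (r + gamma * V[s2]) for p, s2, r in outcomes)
                for outcomes in model[s].values()
            )
            delta = max(delta, abs(best - V[s]))
            V[s] = best
        if delta < tol:
            return V


def q_learning_update(Q, s, a, r, s2, alpha, gamma):
    """Model-free: one temporal-difference update from a single transition."""
    best_next = max(Q[s2].values()) if s2 in Q else 0.0  # terminal counts as 0
    td_error = r + gamma * best_next - Q[s][a]
    Q[s][a] += alpha * td_error
    return td_error


V = value_iteration(MODEL, GAMMA)

# Feed Q-learning observed transitions only. The model is used here purely as
# a simulator that generates (s, a, r, s') tuples; the update never reads it.
Q = {s: {a: 0.0 for a in MODEL[s]} for s in MODEL}
for _ in range(200):
    for s in MODEL:
        for a in MODEL[s]:
            _, s2, r = MODEL[s][a][0]  # deterministic toy: one outcome each
            q_learning_update(Q, s, a, r, s2, alpha=0.5, gamma=GAMMA)

for s in MODEL:
    print(s, round(V[s], 4), round(max(Q[s].values()), 4))
```

On this deterministic toy problem the greedy values max_a Q(s, a) end up matching the values found by value iteration. In a stochastic environment, Q-learning would additionally need an exploration strategy and a suitably decaying learning rate, both of which this sketch leaves out.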