Difference between revisions of "Ratliff, AISTATS 2007"

From Cohen Courses
Jump to navigationJump to search
Line 25: Line 25:
 
Ladar scan classification: compared to CPLEX's intensive memory requirement, the subgradient method has linear memory requirement.  
 
Ladar scan classification: compared to CPLEX's intensive memory requirement, the subgradient method has linear memory requirement.  
  
Value Function Approximation: it tries to estimate the cost-to-go  
+
Value Function Approximation: it tries to estimate the cost-to-go of running a policy from any state.
 
 
== Related papers ==
 

Revision as of 16:49, 25 September 2011

Citation

Nathan Ratliff, J. Andrew (Drew) Bagnell, and Martin Zinkevich, "(Online) Subgradient Methods for Structured Prediction," Eleventh International Conference on Artificial Intelligence and Statistics (AIStats), March, 2007.

Online version

[1]

Summary

In this paper authors propose simple subgradient based technique for optimizing a regularized risk formulation of structured learning problems in both online and batch setting. The paper also analyse the theoretical convergence, generalization, and robustness properties of the technique. They presented results on three problems: Optical character recognition, Ladar scan classification and value function approximation.

Brief description of the method

The method develop an gradient based approach to structured learning using regularized risk formulation of maximum margin structured learning (MMSL) derived by placing the constraints into the objective to create a convex function in w. This objective is then optimized by a direct generalization of gradient descent, called the subgradient method.

The subgradient iteratively computes and steps in the negative direction of a gradient-like vector. After calculating the exact subgradients for single term of the Maximum Margin Structured Classification (MMSC) and Maximum Margin Structured Regression (MMSR)regularized risk function, following the negative of these subgradient is intuitive. The algorithm decrease the score if it is too high and increase it if it is too low. The theoretical analysis is done in the paper.

Experimental Result

Experiments are done on three task.

Optical Character Recognition: The method is almost equal to the previous method but it is quite fast. It took 17 seconds to run the 10 fold cross validation.

Ladar scan classification: compared to CPLEX's intensive memory requirement, the subgradient method has linear memory requirement.

Value Function Approximation: it tries to estimate the cost-to-go of running a policy from any state.