Daume and Marcu 2005 Learning as Search Optimization: Approximate Large Margin Methods for Structured Prediction
Citation and Link
Learning as Search Optimization: Approximate Large Margin Methods for Structured Prediction
Summary
The authors present the Learning as Search Optimization (LaSO) framework. The algorithm is basically SEARN, but analyzed differently (and the paper is ~24 pages shorter).
LaSO combines learning the model with the search that occurs during decoding: instead of learning a model and then searching at decoding time, it attempts to directly learn to search.
Method
The generic search (decoding) algorithm is shown below:

[[File:LaSO Generic Search.png]]
The ''enqueue'' function puts the nodes onto the queue in some order. Depending on the order in which ''enqueue'' places nodes on the queue, you get the depth-first, breadth-first, beam, heuristic, A*, etc. search algorithms from standard AI textbooks.
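To make the role of ''enqueue'' concrete, here is a minimal Python sketch of the generic loop (a sketch, not the paper's code; ''initial_node'', ''is_goal'', and ''expand'' are hypothetical problem-specific hooks):

<pre>
# Minimal sketch of the generic search loop with a pluggable enqueue.
# initial_node, is_goal, and expand are hypothetical hooks supplied by
# the specific structured prediction problem.

def generic_search(x, initial_node, is_goal, expand, enqueue):
    queue = [initial_node(x)]                  # frontier of partial outputs
    while queue:
        node = queue.pop(0)                    # always expand the front node
        if is_goal(node):
            return node                        # a complete structured output
        queue = enqueue(queue, expand(node, x))
    return None                                # search space exhausted

# The ordering imposed by enqueue determines the textbook algorithm:
def enqueue_dfs(queue, successors):
    return successors + queue                  # stack behavior -> depth-first

def enqueue_bfs(queue, successors):
    return queue + successors                  # FIFO behavior -> breadth-first
</pre>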
In LaSO, ''enqueue'' ranks nodes according to a function ''g'' that is linear in the features. The features can depend on the input ''x'' and the path to the current node ''n'':

<math>g = \mathbf{w}^\top \mathbf{\Phi}(x, n)</math>
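A LaSO-style ''enqueue'' can then be sketched as sorting candidates by ''g'' and keeping a beam (again a sketch: ''phi'' stands in for the feature function, and ''beam_width'' is an assumed hyperparameter):

<pre>
# Sketch of a LaSO-style enqueue: rank nodes by the linear score
# g(n) = w . Phi(x, n) and keep only the best beam_width of them.

def make_laso_enqueue(w, phi, x, beam_width=10):
    def g(n):
        # dot product of weight vector w and feature vector Phi(x, n)
        return sum(wi * fi for wi, fi in zip(w, phi(x, n)))
    def enqueue(queue, successors):
        return sorted(queue + successors, key=g, reverse=True)[:beam_width]
    return enqueue
</pre>

With beam_width = 1 this reduces to greedy search; with an unbounded beam it becomes best-first search under ''g''.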
LaSO learns the feature weights ''w'' from the training examples. The learning algorithm is shown below:

[[File:LaSO Algorithm.png]]
Nodes from which there still exists a path to the optimal goal are called "y-good" nodes. The ''siblings'' function returns the set of nodes at the same depth as the current node that can reach the goal (i.e., are y-good). The algorithm may have to backtrack to find them if they are not currently in the queue.
If the search makes a mistake, the weights are updated with the ''update'' function. The two update rules proposed in the paper are the perceptron update (shown below) and a variant of the approximate large margin algorithm (ALMA).
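Here is a sketch of the perceptron-style update and the error-handling step, as one plausible reading of the paper's pseudocode rather than its exact code (''phi'', ''siblings_fn'', and ''is_y_good'' are hypothetical hooks):

<pre>
# Sketch of LaSO's perceptron-style update: when no node in the queue is
# y-good, move w toward the mean features of the y-good siblings and away
# from the mean features of the (bad) nodes in the queue, then restart the
# search from the good siblings. An interpretation, not the paper's code.

def vec_add(u, v, scale=1.0):
    return [ui + scale * vi for ui, vi in zip(u, v)]

def mean_features(phi, x, nodes):
    total = [0.0] * len(phi(x, nodes[0]))
    for n in nodes:
        total = vec_add(total, phi(x, n))
    return [t / len(nodes) for t in total]

def perceptron_update(w, phi, x, good_nodes, queue):
    w = vec_add(w, mean_features(phi, x, good_nodes), +1.0)   # toward good
    w = vec_add(w, mean_features(phi, x, queue), -1.0)        # away from bad
    return w

def laso_error_step(w, phi, x, queue, siblings_fn, is_y_good):
    if not any(is_y_good(n) for n in queue):      # search has made a mistake
        good = siblings_fn(queue, x)              # y-good nodes at this depth
        w = perceptron_update(w, phi, x, good, queue)
        queue = good                              # resume search from good nodes
    return w, queue
</pre>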
Experimental Results
Related Papers
In progress by User:Jmflanig