Perez-Cruz and Ghahramani 2007 Conditional graphical models

From Cohen Courses
Revision as of 02:08, 11 October 2011 by Jmflanig (talk | contribs)

Citation and Online Link

Conditional graphical models, Perez-Cruz & Ghahramani, 2007, in Predicting Structured Data, MIT Press, Cambridge, MA, USA, pp. 265-282.

Summary

In this paper the authors propose a generalization of CRF-like algorithms called Conditional Graphical Models (CGMs), which allows any multi-class learning algorithm to be used on the cliques of the graph. The first half of the paper is a useful review of the many CRF-like algorithms that preceded it: the authors place these algorithms in a unified mathematical framework and show that their learning procedures all solve a convex optimization problem. They then show that this convex problem can be simplified, and it is this simplification that allows an arbitrary multi-class learner (such as an SVM) to be applied to each clique. As a further benefit, the optimization problem during training becomes easier, while decoding remains identical to that of a standard CRF.

Method

MAP training of Conditional Random Fields (CRFs) can be cast into the form:
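The equation itself did not survive in this revision. As a reconstruction (the standard regularized MAP objective for CRFs, not reproduced from the paper), where $\mathbf{w}$ is the weight vector, $\boldsymbol{\phi}(\mathbf{x}, \mathbf{y})$ is the joint feature map, and $C$ trades off fit against regularization:

```latex
\min_{\mathbf{w}} \;\; \frac{1}{2}\,\|\mathbf{w}\|^2
\; + \; C \sum_{n=1}^{N}
\Big[ \log \sum_{\mathbf{y}} \exp\!\big(\mathbf{w}^\top \boldsymbol{\phi}(\mathbf{x}_n, \mathbf{y})\big)
\; - \; \mathbf{w}^\top \boldsymbol{\phi}(\mathbf{x}_n, \mathbf{y}_n) \Big]
```

The second term is the negative conditional log-likelihood of the training labels; in a graphical model the inner sum over $\mathbf{y}$ is computed efficiently by dynamic programming over the cliques.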

Experimental Result

They evaluated their method on a handwritten-word recognition task. The dataset consisted of 6,877 words, 4 to 15 letters long, written by 150 different people. Each word was represented as a sequence of 16-by-8 binary pixel images (one per letter), paired with the correct label for each letter.

They ran experiments with three different graphical models, all using SVMs on the cliques. The first classified each letter independently; the second was a linear chain (like a standard CRF); the third modeled three-letter interactions. The last model was only feasible within the CGM framework, because its training is much less computationally expensive. It improved on the baseline (M3N, the Max-Margin Markov Network) by over 50% on the letter recognition task, reducing the error rate from 13% to 5.8%.
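Since decoding in a CGM is identical to decoding in a CRF, the linear-chain variant above amounts to Viterbi decoding over per-letter clique scores (here, hypothetical SVM decision values) plus pairwise transition scores. A minimal sketch, with toy scores in place of trained SVM outputs:

```python
import numpy as np

def viterbi_decode(unary, pairwise):
    """Find the highest-scoring label sequence for a linear chain.

    unary:    (T, K) array of per-position scores (e.g. SVM decision values)
    pairwise: (K, K) array of transition scores between adjacent labels
    """
    T, K = unary.shape
    score = unary[0].copy()              # best score ending in each label at t=0
    back = np.zeros((T, K), dtype=int)   # backpointers
    for t in range(1, T):
        # cand[i, j] = best score ending at t-1 in i, then moving to j at t
        cand = score[:, None] + pairwise + unary[t][None, :]
        back[t] = cand.argmax(axis=0)
        score = cand.max(axis=0)
    # follow backpointers from the best final label
    path = [int(score.argmax())]
    for t in range(T - 1, 0, -1):
        path.append(int(back[t][path[-1]]))
    return path[::-1]

# Toy example: 3 positions, 2 labels; scores favor label 0 first, then label 1.
unary = np.array([[2.0, 0.0], [0.0, 1.0], [0.0, 1.0]])
pairwise = np.zeros((2, 2))
print(viterbi_decode(unary, pairwise))  # [0, 1, 1]
```

The three-letter-interaction model would replace the (K, K) transition table with a (K, K, K) tensor over label triples; the same dynamic program applies with a larger state space.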

Related Papers

The CRF and CRF-like methods they put into a unified framework include Lafferty 2001 Conditional Random Fields.