Perez-Cruz and Ghahramani 2007 Conditional graphical models
Citation and Online Link
Summary
In this paper the authors propose a generalization of CRF-like algorithms called Conditional Graphical Models (CGMs), which allows any multi-class learning algorithm to be used on the cliques of the graph. The first half of the paper is a nice review of the many CRF-like algorithms that existed beforehand: the authors put these algorithms into a unified mathematical framework and show that their learning procedures solve a convex optimization problem. They then show that this convex optimization problem can be simplified, and it is this simplification that allows an arbitrary multi-class learner (such as an SVM) to be trained on each clique of the graph. In addition, the resulting training problem is computationally easier, and decoding is identical to that of a normal CRF.
Method
MAP training of Conditional Random Fields (CRFs) can be cast into the form:
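The equation itself seems to have been lost from these notes. As a sketch of the standard form being referred to (assuming the usual log-linear clique potentials, with \lambda the regularization weight, \phi_c the feature map on clique c, and Z the partition function), CRF MAP training minimizes the regularized negative conditional log-likelihood:

```latex
\min_{\mathbf{w}} \; \frac{\lambda}{2}\,\|\mathbf{w}\|^{2}
  \;-\; \sum_{n=1}^{N} \log p(\mathbf{y}_{n} \mid \mathbf{x}_{n}; \mathbf{w}),
\qquad
p(\mathbf{y} \mid \mathbf{x}; \mathbf{w}) \;=\;
  \frac{1}{Z(\mathbf{x}, \mathbf{w})}
  \exp\!\Big( \sum_{c \in \mathcal{C}} \mathbf{w}^{\top} \boldsymbol{\phi}_{c}(\mathbf{x}, \mathbf{y}_{c}) \Big)
```

The CGM simplification described in the summary replaces this joint problem with independent per-clique multi-class problems, so each clique potential can be fit by any multi-class learner.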
Experimental Result
They evaluated their method on a handwritten-word recognition task. The dataset contained 6877 words of 4 to 15 letters, written by 150 different people. Each word was represented as a sequence of 16-by-8-pixel binary images (one per letter), paired with the correct label for each letter.
They ran experiments with three different graphical models, all using SVMs on the cliques: the first classified each letter independently, the second was a linear chain (like a normal CRF), and the third included three-letter interactions. This last model was only practical in the CGM framework, because training is much less computationally expensive. It cut the letter-recognition error rate by more than half relative to the baseline (M3N, the Max-Margin Markov Network), from 13% to 5.8%.
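To make the "any multi-class learner on the cliques, ordinary CRF decoding" idea concrete, here is a minimal sketch of the linear-chain case. This is not the paper's implementation: the data is synthetic (random stand-ins for the 16x8 letter images), scikit-learn's LinearSVC stands in for the per-clique learner, and the pairwise potentials are simplified to bigram label statistics rather than being trained discriminatively as the paper does.

```python
# Sketch: SVM-trained unary clique potentials + Viterbi decoding on a chain.
import numpy as np
from sklearn.svm import LinearSVC

rng = np.random.default_rng(0)
N_LABELS, N_FEATS, N_WORDS = 5, 16 * 8, 200

# Synthetic stand-in for the handwriting data: each word is a sequence of
# flattened binary "letter images" with a label per letter.
prototypes = rng.random((N_LABELS, N_FEATS)) < 0.5  # fixed template per label

def make_word(length):
    labels = rng.integers(0, N_LABELS, size=length)
    noise = rng.random((length, N_FEATS)) < 0.9      # keep 90% of pixels
    images = np.where(noise, prototypes[labels], ~prototypes[labels])
    return images.astype(float), labels

words = [make_word(rng.integers(4, 16)) for _ in range(N_WORDS)]

# Unary cliques: pool all letters and fit one multi-class SVM.
X = np.vstack([imgs for imgs, _ in words])
y = np.concatenate([labs for _, labs in words])
unary_svm = LinearSVC(max_iter=5000).fit(X, y)

# Pairwise cliques: log bigram frequencies of adjacent labels
# (a simplification; the paper trains these discriminatively too).
trans = np.ones((N_LABELS, N_LABELS))
for _, labs in words:
    for a, b in zip(labs[:-1], labs[1:]):
        trans[a, b] += 1
log_trans = np.log(trans / trans.sum(axis=1, keepdims=True))

def viterbi(images):
    """Decode a word exactly as in a normal chain CRF."""
    scores = unary_svm.decision_function(images)     # (T, N_LABELS) potentials
    T = len(images)
    dp, back = scores[0].copy(), np.zeros((T, N_LABELS), dtype=int)
    for t in range(1, T):
        cand = dp[:, None] + log_trans + scores[t][None, :]
        back[t] = cand.argmax(axis=0)
        dp = cand.max(axis=0)
    path = [int(dp.argmax())]
    for t in range(T - 1, 0, -1):
        path.append(int(back[t][path[-1]]))
    return path[::-1]

imgs, labs = make_word(8)
print("true:", labs.tolist(), "pred:", viterbi(imgs))
```

The three-letter-interaction model from the experiments would extend this by training additional classifiers on triples of adjacent positions; the decoding step stays the same dynamic program, just over larger cliques.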
Related Papers
The CRF and CRF-like methods they put into a unified framework include Lafferty 2001 Conditional Random Fields,