Apappu writeup of Sha and Pereira
This is a review of the paper Sha_2003_shallow_parsing_with_conditional_random_fields by user:Apappu.
Shallow parsing with CRFs
- This paper focuses on evaluating and discussing the training of a CRF model to improve performance on shallow parsing.
- CRFs bring the best of sequential classifiers (statistically correlated features) and generative models (trading off decisions at different sequence positions).
- A CRF on (X, Y) is defined by a weighted sum of local features with weight vector W.
Each local feature is either a state feature or a transition feature.
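To make the state/transition distinction concrete, a minimal sketch of CRF-style local feature scoring (feature names and weights are illustrative, not taken from the paper):

```python
# Hypothetical sketch: score a label sequence y for an observation sequence x
# as a weighted sum of fired local features. A state feature looks at the
# current label and observation; a transition feature looks at adjacent labels.

def score(x, y, weights):
    """Unnormalized log-score of label sequence y for observations x."""
    total = 0.0
    for i in range(len(x)):
        # state feature: (current label, current word)
        total += weights.get(("state", y[i], x[i]), 0.0)
        if i > 0:
            # transition feature: (previous label, current label)
            total += weights.get(("trans", y[i - 1], y[i]), 0.0)
    return total

weights = {("state", "B-NP", "the"): 1.2, ("trans", "B-NP", "I-NP"): 0.8}
print(score(["the", "cat"], ["B-NP", "I-NP"], weights))  # 1.2 + 0.8 = 2.0
```

The CRF itself normalizes these scores over all label sequences; the sketch shows only the unnormalized sum.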
- The authors state that iterative scaling algorithms for CRF training are much slower when local features are used, so they tried conjugate gradient and second-order methods for training; L-BFGS performs comparably on the large corpus.
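As a stand-in for the paper's L-BFGS training, a small sketch minimizing a toy convex objective (a logistic loss plus L2 penalty, not the CRF likelihood) with SciPy's limited-memory BFGS; the data is made up:

```python
# Toy L-BFGS training run: minimize a convex regularized logistic loss
# with scipy.optimize.minimize. This illustrates the optimizer, not the
# paper's actual CRF objective.
import numpy as np
from scipy.optimize import minimize

X = np.array([[1.0, 0.0], [0.0, 1.0], [1.0, 1.0], [0.0, 0.0]])
y = np.array([1, 0, 1, 0])

def neg_log_lik(w):
    z = X @ w
    # log(1 + e^z) - y*z is the per-example negative log-likelihood,
    # plus an L2 penalty on the weights
    return np.sum(np.logaddexp(0.0, z) - y * z) + 0.5 * w @ w

res = minimize(neg_log_lik, np.zeros(2), method="L-BFGS-B")
print(res.success, res.x)
```

The limited-memory variant matters at CRF scale because it avoids storing the full (dense) inverse Hessian over millions of feature weights.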
- preconditioner: improves the condition number of the quadratic form
- I didn't understand how preconditioned CG actually works.
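For intuition, here is textbook preconditioned CG on a linear system with a simple Jacobi (diagonal) preconditioner; this is a generic sketch, not the specific preconditioner Sha and Pereira use:

```python
# Generic preconditioned conjugate gradient for A x = b, with the Jacobi
# preconditioner M = diag(A). The preconditioner solve (here an elementwise
# divide) reshapes the quadratic form so CG converges in fewer iterations.
import numpy as np

def pcg(A, b, tol=1e-10, max_iter=100):
    x = np.zeros_like(b)
    r = b - A @ x                 # residual
    M_inv = 1.0 / np.diag(A)      # apply M^{-1} by elementwise divide
    z = M_inv * r                 # preconditioned residual
    p = z.copy()                  # search direction
    for _ in range(max_iter):
        Ap = A @ p
        alpha = (r @ z) / (p @ Ap)
        x = x + alpha * p
        r_new = r - alpha * Ap
        if np.linalg.norm(r_new) < tol:
            return x
        z_new = M_inv * r_new
        beta = (r_new @ z_new) / (r @ z)
        p = z_new + beta * p
        r, z = r_new, z_new
    return x

A = np.array([[4.0, 1.0], [1.0, 3.0]])
b = np.array([1.0, 2.0])
print(pcg(A, b))  # matches np.linalg.solve(A, b)
```

In CRF training the "system" is the local quadratic model of the log-likelihood, and a good preconditioner approximates its curvature cheaply.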
- voted perceptron: reduces overfitting considerably.
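For reference, a minimal averaged perceptron (a common, cheaper approximation of the voted perceptron's weighted majority vote); the data is a toy binary problem, not the paper's sequence-labeling variant:

```python
# Illustrative averaged perceptron: accumulate the weight vector after every
# example so the final classifier averages over all intermediate hypotheses,
# which is what damps overfitting relative to the last-weights perceptron.
import numpy as np

def averaged_perceptron(X, y, epochs=10):
    w = np.zeros(X.shape[1])
    w_sum = np.zeros_like(w)
    for _ in range(epochs):
        for xi, yi in zip(X, y):    # yi in {-1, +1}
            if yi * (w @ xi) <= 0:  # mistake-driven update
                w += yi * xi
            w_sum += w              # accumulate for averaging
    return w_sum / (epochs * len(X))

X = np.array([[2.0, 1.0], [1.0, 2.0], [-1.0, -2.0], [-2.0, -1.0]])
y = np.array([1, 1, -1, -1])
w = averaged_perceptron(X, y)
print(np.sign(X @ w))  # separates the toy training points
```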
[using different training methods and comparison]
All features were used in training and testing.
(But on the CoNLL training set, they used only those features that fire at least once.)
For their highest F-score they used all features.
(negative weights on transitions )
tuning: Gaussian weight prior
they used the development set to choose the prior variance and the number of training iterations
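The Gaussian prior amounts to an L2 penalty on the objective; a toy sketch of how it enters the loss (sigma and the numbers are illustrative, with sigma being the dev-set-tuned prior parameter):

```python
# Sketch: a zero-mean Gaussian prior on the weights adds ||w||^2 / (2 sigma^2)
# to the negative log-likelihood, shrinking large weights and curbing
# overfitting. All values here are toy.
import numpy as np

def penalized_loss(neg_log_lik, w, sigma):
    return neg_log_lik + (w @ w) / (2.0 * sigma ** 2)

w = np.array([0.5, -1.0, 2.0])
print(penalized_loss(10.0, w, sigma=1.0))  # 10 + 5.25/2 = 12.625
```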
convergence speed: GIS << mixed CG < CG [approximate second-order info]