Apappu writeup of Sha and Pereira

From Cohen Courses

This is a review of the paper Sha_2003_shallow_parsing_with_conditional_random_fields by user:Apappu.

Shallow parsing with CRFs

  • This paper focuses on evaluating and discussing how training a CRF model improves performance on shallow parsing.
  • CRFs bring together the best of sequential classifiers (they can use many statistically correlated features) and generative models (they can trade off decisions at different sequence positions).
  • A CRF on (X, Y) is specified by a vector f of local features and a corresponding weight vector W.

Each local feature is either a state feature or a transition feature.
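To make the definition above concrete, here is how the model can be written out. The symbols F, Z_W, s and t are notation chosen for these notes (with W for the weight vector as above), so treat this as a sketch of the standard linear-chain CRF form rather than a quotation from the paper.

```latex
% State and transition features both look at position i of the sequence:
%   state feature:      s(y_i, x, i)
%   transition feature: t(y_{i-1}, y_i, x, i)
% The global feature vector sums the local feature vector f over positions:
\[
  F(y, x) = \sum_{i} f(y, x, i)
\]
% The conditional distribution is log-linear in the global features:
\[
  p_W(Y \mid X) = \frac{\exp\bigl(W \cdot F(Y, X)\bigr)}{Z_W(X)},
  \qquad
  Z_W(x) = \sum_{y} \exp\bigl(W \cdot F(y, x)\bigr)
\]
```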

  • The authors state that iterative scaling algorithms for CRF training are much slower when local features are used, so they tried conjugate gradient and second-order methods for training instead. L-BFGS performs comparably on a large corpus.

  • Preconditioner: improves the condition number of the quadratic form.
  • I didn't understand how preconditioned CG actually works.
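On the preconditioned CG point, a minimal sketch may help. It solves A x = b for a small symmetric positive-definite matrix, which is the same as minimizing the quadratic form 1/2 x^T A x - b^T x. The Jacobi (diagonal) preconditioner and the function names here are assumptions picked for illustration, not the preconditioner used in the paper.

```python
def matvec(A, v):
    """Multiply a dense matrix (list of rows) by a vector."""
    return [sum(a * x for a, x in zip(row, v)) for row in A]


def dot(u, v):
    return sum(a * b for a, b in zip(u, v))


def preconditioned_cg(A, b, tol=1e-10, max_iter=100):
    """Solve A x = b for symmetric positive-definite A.

    Assumption for illustration: Jacobi preconditioner M = diag(A).
    Returns the solution and the number of iterations performed.
    """
    n = len(b)
    m_inv = [1.0 / A[i][i] for i in range(n)]   # M^-1, cheap to apply
    x = [0.0] * n
    r = list(b)                                  # residual b - A x at x = 0
    z = [mi * ri for mi, ri in zip(m_inv, r)]    # preconditioned residual
    p = list(z)                                  # first search direction
    rz = dot(r, z)
    for it in range(max_iter):
        if dot(r, r) ** 0.5 < tol:
            return x, it
        Ap = matvec(A, p)
        alpha = rz / dot(p, Ap)                  # exact step along p
        x = [xi + alpha * pi for xi, pi in zip(x, p)]
        r = [ri - alpha * api for ri, api in zip(r, Ap)]
        z = [mi * ri for mi, ri in zip(m_inv, r)]
        rz_new = dot(r, z)
        beta = rz_new / rz                       # keeps directions A-conjugate
        p = [zi + beta * pi for zi, pi in zip(z, p)]
        rz = rz_new
    return x, max_iter


solution, iterations = preconditioned_cg([[4.0, 1.0], [1.0, 3.0]], [1.0, 2.0])
```

The only difference from plain CG is the extra vector z = M^-1 r: search directions are built from z instead of r, which amounts to running CG on a rescaled problem with a better condition number.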


  • The voted perceptron reduces overfitting considerably.
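As a rough illustration of the perceptron idea, the sketch below trains a structured perceptron and returns the average of the weight vectors seen during training, which is the usual cheap stand-in for voting. The label set, the feature templates, the toy data and the brute-force decoder are all hypothetical choices made for this example; whether the details match the paper's setup is an assumption.

```python
from collections import defaultdict
from itertools import product

LABELS = ("B", "I", "O")  # hypothetical chunk tags


def global_features(words, tags):
    """Count state (word, tag) and transition (prev_tag, tag) features."""
    feats = defaultdict(int)
    prev = "<s>"
    for word, tag in zip(words, tags):
        feats[("state", word, tag)] += 1
        feats[("trans", prev, tag)] += 1
        prev = tag
    return feats


def score(weights, feats):
    return sum(weights.get(k, 0.0) * v for k, v in feats.items())


def decode(weights, words):
    """Brute-force argmax over all taggings (fine for toy inputs only)."""
    return max(product(LABELS, repeat=len(words)),
               key=lambda tags: score(weights, global_features(words, tags)))


def train_averaged_perceptron(data, epochs=5):
    weights = defaultdict(float)
    totals = defaultdict(float)   # running sum of weight vectors
    steps = 0
    for _ in range(epochs):
        for words, gold in data:
            guess = decode(weights, words)
            if guess != tuple(gold):
                # Move towards the gold features, away from the guess.
                for k, v in global_features(words, gold).items():
                    weights[k] += v
                for k, v in global_features(words, guess).items():
                    weights[k] -= v
            for k, v in weights.items():
                totals[k] += v
            steps += 1
    # Averaging over all steps damps the effect of late, noisy updates.
    return {k: v / steps for k, v in totals.items()}


toy_data = [(["the", "cat", "sat"], ["B", "I", "O"]),
            (["a", "dog", "ran"], ["B", "I", "O"])]
averaged = train_averaged_perceptron(toy_data)
```

A real implementation would replace the brute-force `decode` with Viterbi, since enumerating all taggings is exponential in sentence length.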


[Comparison of the different training methods]


All features were used in training and testing.


(But on the CoNLL training set, they used only those features that fired at least once.)

For their highest F-score, they used all features.


(Negative weights on transitions.)

Tuning: Gaussian weight prior.

They used the development set to choose the prior and the number of training iterations.
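For reference, the Gaussian weight prior shows up as a quadratic penalty on the training objective. The notation below (training pairs (x_k, y_k), global feature vector F, normalizer Z_W, prior variance σ²) is assumed for these notes; σ² is the quantity that would be tuned on the development set.

```latex
% Penalized conditional log-likelihood with a zero-mean Gaussian prior on W:
\[
  L(W) = \sum_{k} \Bigl[ W \cdot F(y_k, x_k) - \log Z_W(x_k) \Bigr]
         - \frac{\lVert W \rVert^{2}}{2\sigma^{2}} + \text{const}
\]
% Its gradient is observed minus expected feature counts, shrunk towards zero:
\[
  \nabla L(W) = \sum_{k} \Bigl[ F(y_k, x_k)
                - \mathbb{E}_{p_W(Y \mid x_k)} F(Y, x_k) \Bigr]
                - \frac{W}{\sigma^{2}}
\]
```

A smaller σ² pulls the weights harder towards zero, which is why the prior and the number of training iterations both act as regularization knobs.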


Convergence speed: GIS << Mixed CG < CG [approximate second-order information]