Sha 2003 Shallow Parsing with Conditional Random Fields
Citation
Sha, F. and Pereira, F. 2003. Shallow Parsing with Conditional Random Fields. In Proceedings of HLT-NAACL.
Online version
An online version of this paper is available [1].
Summary
This paper applies Conditional Random Fields (CRFs) to the task of shallow parsing. The authors compare CRFs against previous methods in terms of both accuracy and training efficiency.
Key Contributions
The paper presents the following key findings:
- The authors claim that CRFs outperform, or match, all previously reported results on shallow parsing, one application of the sequential labeling task.
- The authors also report that their improved training methods yield substantial gains in efficiency, as shown in their experimental results.
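For reference, the linear-chain CRF model behind these claims (its formula is not reproduced in this summary) defines the conditional probability of a label sequence $y$ given an input sequence $x$ as:

```latex
p(y \mid x) = \frac{1}{Z(x)} \exp\!\left( \sum_{t=1}^{T} \sum_{k} \lambda_k\, f_k(y_{t-1}, y_t, x, t) \right),
\qquad
Z(x) = \sum_{y'} \exp\!\left( \sum_{t=1}^{T} \sum_{k} \lambda_k\, f_k(y'_{t-1}, y'_t, x, t) \right)
```

Here the $f_k$ are feature functions over adjacent labels and the input, the $\lambda_k$ are learned weights, and $Z(x)$ is the partition function that normalizes over all label sequences.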
Conditional Random Fields
The paper uses a proprietary data set of almost 10,000 voicemail messages with manual transcriptions and markup, as illustrated in the following excerpt:
<greeting> hi Jane </greeting> <caller> this is Pat Caller </caller> I just wanted to I know you’ve probably seen this or maybe you already know about it . . . so if you could give me a call at <telno> one two three four five </telno> when you get the message I’d like to chat about it hope things are well with you <closing> talk to you soon </closing>
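The annotation format in the excerpt is simple paired tags around labeled spans. A minimal sketch of reading such markup (tag names taken from the excerpt; this helper is illustrative, not part of the paper):

```python
import re

def extract_spans(transcript: str) -> dict:
    """Collect labeled spans such as <caller>...</caller> from an
    annotated voicemail transcript into a {tag: [text, ...]} dict."""
    spans = {}
    # \1 back-references the opening tag name to match its closing tag.
    for tag, text in re.findall(r"<(\w+)>(.*?)</\1>", transcript):
        spans.setdefault(tag, []).append(text.strip())
    return spans

transcript = ("<greeting> hi Jane </greeting> <caller> this is Pat Caller </caller> "
              "give me a call at <telno> one two three four five </telno>")
print(extract_spans(transcript))
```

Running this on the shortened excerpt yields one span each for `greeting`, `caller`, and `telno`.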
Shallow Parsing
The authors present two sets of algorithms for two different sub-tasks: Caller Identification and Phone Number Extraction.
- For Caller Identification (Caller Phrase/Caller Name), the authors focus on two target variables: the starting position of the Caller Phrase/Caller Name and its length. They first show that the empirical distributions of both position and length are highly skewed. They then apply a decision-tree learner with a small set of features based on common words to predict these two target variables, and compare their results with those from previous work. In fact, the new algorithm performs worse than the previous method on their dataset. However, it shows a large improvement over the previous method on unseen data, for which they use ASR (automatic speech recognition) output. The authors therefore argue that the previous method, built on a generic named entity recognizer, tends to overfit the dataset, while the new algorithm is more robust on unseen data. They also transfer this technique to extract the Caller Name instead of the Caller Phrase.
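The paper does not list the exact common-word features, so the following is only a sketch of the idea: cue phrases vote for a starting position, with the (here hand-coded) rule standing in for the learned decision tree.

```python
# Hypothetical sketch: predict where the caller phrase starts from
# common-word cues ("this is", ...). The real system learns a decision
# tree over such features; the rule below is hand-coded for illustration.
COMMON_CUES = ("this is", "my name is", "it's")

def predict_caller_start(words):
    """Return the word index where the caller phrase likely starts."""
    text = " ".join(words).lower()
    for cue in COMMON_CUES:
        pos = text.find(cue)
        if pos != -1:
            # Convert the character offset back to a word index.
            return len(text[:pos].split())
    return 0  # fall back to the mode of the (skewed) start distribution

words = "hi Jane this is Pat Caller I just wanted to".split()
start = predict_caller_start(words)
print(start, words[start:start + 4])  # → 2 ['this', 'is', 'Pat', 'Caller']
```

The length of the phrase would be predicted the same way from a second set of features; both predictions benefit from the skewed empirical distributions noted above.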
- For Phone Number Extraction, the authors propose a two-phase approach. In the first phase, the algorithm uses a hand-crafted grammar to propose candidate phone numbers and convert them into a numeric representation. In the second phase, a binary classifier judges the validity of each candidate. Despite its simplicity, this method proves highly effective: the authors report a 10% improvement in F-measure over the previous method.
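A toy version of the two-phase pipeline might look as follows; the "grammar" here is just maximal runs of spoken digits, and the length check stands in for the binary classifier (neither detail is from the paper):

```python
# Phase 1 proposes candidates via a toy grammar (runs of spoken digits)
# and converts them to a numeric representation; phase 2 filters them.
WORD2DIGIT = {"zero": "0", "oh": "0", "one": "1", "two": "2", "three": "3",
              "four": "4", "five": "5", "six": "6", "seven": "7",
              "eight": "8", "nine": "9"}

def propose_candidates(words):
    """Phase 1: collect maximal runs of spoken digits as numeric strings."""
    candidates, run = [], []
    for w in words + ["<end>"]:  # sentinel flushes the final run
        if w.lower() in WORD2DIGIT:
            run.append(WORD2DIGIT[w.lower()])
        else:
            if run:
                candidates.append("".join(run))
            run = []
    return candidates

def is_valid(candidate):
    """Phase 2: stand-in for the binary classifier - accept only
    plausible phone-number lengths (an assumption, not from the paper)."""
    return len(candidate) in (5, 7, 10)

words = "call me at one two three four five when you get the message".split()
print([c for c in propose_candidates(words) if is_valid(c)])  # → ['12345']
```

Separating proposal from validation keeps the grammar permissive (high recall) while the classifier restores precision, which matches the division of labor described above.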
Experiments
The Huang et al., 2001 paper discussed a very similar problem from a more traditional perspective, studying three approaches: hand-crafted rules, grammatical inference of subsequential transducers, and a log-linear classifier with bigram and trigram features, which is essentially the same model as in Ratnaparkhi's 1996 paper on maximum-entropy POS tagging.