Paper:Collins, ACL 2002

Revision as of 00:47, 26 September 2011

Citation

Ranking Algorithms for Named-Entity Extraction: Boosting and the Voted Perceptron, Collins, ACL 2002

Online Version

Here is the online version of the paper.

Summary

In this paper the author describes two algorithms which rerank the top N hypotheses from a maximum-entropy tagger, and applies them to the task of recovering named-entity boundaries. The task at hand can be framed as a tagging task: tag each word as the start of an entity (S), a continuation of an entity (C), or not part of an entity at all (N). The author uses a state-of-the-art maximum-entropy tagger, similar to the one described in Ratnaparkhi EMNLP 1996, as the baseline model. The same Max-ent tagger is used to generate the N=20 most probable segmentations for each input sentence, along with their probabilities. The author's aim is to come up with reranking strategies for the test-data candidates in order to improve precision and recall.
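The S/C/N framing above can be illustrated with a small sketch. This is not code from the paper; the function name and span convention are illustrative assumptions.

```python
# Hedged sketch: encoding named-entity boundaries as per-word S/C/N tags,
# the tagging scheme the task is framed in. Names and the (start, end)
# span convention are illustrative, not taken from the paper.

def spans_to_scn(num_words, entity_spans):
    """Map entity spans [(start, end), ...] (end exclusive) to S/C/N tags."""
    tags = ["N"] * num_words           # N: word is not part of any entity
    for start, end in entity_spans:
        tags[start] = "S"              # S: start of an entity
        for i in range(start + 1, end):
            tags[i] = "C"              # C: continuation of an entity
    return tags

# A five-word sentence with entities covering words 0-1 and 3-4:
print(spans_to_scn(5, [(0, 2), (3, 5)]))  # ['S', 'C', 'N', 'S', 'C']
```

Each candidate segmentation from the tagger is then just one such tag sequence over the sentence.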

The author considers various global features for each candidate tagged sequence. Most of these features are anchored on entity boundaries in the candidate segmentation. Each candidate tagged sequence proposed by the Max-ent tagger is represented by the log probability from the tagger, as well as the values of the global features <math>h_s(x)</math> for <math>s = 1 \ldots m</math>, where <math>m</math> is the number of global features. The two algorithms described in the next section blend these two sources of information (global features and log probability) to improve upon a strategy which just takes the candidate with the highest score <math>L(x)</math> from the tagger.

The author has shown that two existing reranking methods, boosting and the voted perceptron, are useful in a new domain, named-entity tagging. Another contribution is the suggestion of various global features of the candidate segmentations which give improvements on this task.

Brief description of the methods

Pairs of the form <math>(x_i, y_i)</math>, where each <math>x_i</math> is a sentence and each <math>y_i</math> is the correct tag sequence for that sentence, serve as the training data. <math>x_{i,j}</math> for <math>j = 1 \ldots N</math> denotes the set of candidates (top N outputs from the Max-ent tagger) for each <math>x_i</math>, where <math>x_{i,j}</math> is the <math>j</math>-th candidate of the <math>i</math>-th sentence. <math>p_{i,j}</math> is the probability that the base model assigns to <math>x_{i,j}</math>. Hence <math>L(x_{i,j}) = \log p_{i,j}</math>. The set of global features are represented as <math>h_s(x)</math> for <math>s = 1 \ldots m</math>. The parameters of the model are a vector of <math>m+1</math> parameters, <math>w = \{w_0, w_1, \ldots, w_m\}</math>. The ranking function is given as

<math>F(x,w)=w_0L(x)+\sum\limits_{s=1}^{m} w_sh_s(x)</math>

The ranking function can also be written as <math>F(x,w) = w \cdot h(x)</math> if <math>h(x)</math> is defined as <math>\{L(x), h_1(x), \ldots, h_m(x)\}</math>.
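A minimal sketch of how such a linear ranking function would rerank the tagger's top-N candidates. The weights and candidate representations below are made-up toy values, not from the paper, and in the actual methods the weights are learned by boosting or the voted perceptron.

```python
# Hedged sketch of a linear ranking function F(x, w) = w . h(x), where
# h(x) = [L(x), h_1(x), ..., h_m(x)]. Toy values only; the real weights
# are learned by the boosting / voted-perceptron algorithms.

def rank_score(w, log_prob, global_feats):
    """F(x, w) = w_0 * L(x) + sum_s w_s * h_s(x)."""
    feats = [log_prob] + list(global_feats)
    return sum(ws * hs for ws, hs in zip(w, feats))

def rerank(w, candidates):
    """Return the index of the candidate (log_prob, feats) maximizing F."""
    return max(range(len(candidates)),
               key=lambda j: rank_score(w, *candidates[j]))

# Toy example: a weight for L(x) plus two binary global features.
w = [1.0, 0.5, -0.3]
candidates = [(-1.9, (0, 1)),   # highest log prob, but a bad global feature
              (-2.0, (1, 0)),
              (-2.2, (1, 1))]
best = rerank(w, candidates)    # the global features flip the decision
```

Here the baseline (highest <math>L(x)</math> alone) would pick index 0, while the reranker prefers index 1, which is exactly the kind of correction the global features are meant to enable.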

Boosting

Methods: Boosting, Voted Perceptron

Experimental Result

Related papers