Difference between revisions of "Mayfield et al, CoNLL 2003"
Revision as of 17:19, 28 September 2011
Being edited by Rui Correia
Citation
James Mayfield, Paul McNamee, and Christine Piatko. 2003. Named Entity Recognition using Hundreds of Thousands of Features. In Proceedings of CoNLL-2003.
Online version
Summary
In this paper the authors address the problem of Named Entity Recognition using Support Vector Machines to capture transition probabilities in a lattice, a method they call SVM-lattice. Their main goal is to provide a language-independent Named Entity Recognition system that considers hundreds of thousands of features and lets the SVM decide which of them are relevant.
In most Named Entity Recognition systems, handling large numbers of features is expensive and may result in overtraining, which demands careful and informed feature selection. The solution proposed by the authors is to build a lattice for each sentence (where the vertices are the tags and the edges are the possible transitions) and to compute the transition probabilities of the edges. Once these probabilities are computed, the authors apply the Viterbi algorithm to find the best path and thereby decide on the sequence of tags.
Brief Description of the Method
Each sentence is processed individually. A lattice is built for each sentence, in which each column contains one vertex for each possible tag, and each vertex is connected by an edge to every vertex in the next column that represents a valid transition. To compute the transition probabilities, the authors exploit some important properties of SVMs. Once these probabilities are estimated and applied to the lattice, the authors run Viterbi to find the most likely path.
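The decoding step over such a lattice can be illustrated with a minimal Python sketch. This is not the authors' implementation: the function name viterbi, the tag set (O, B-PER, I-PER), and all probabilities below are hypothetical values chosen for illustration. In the paper the edge probabilities would come from the SVMs; here they are simply hard-coded. A tag pair that is missing from a column's dictionary is treated as an invalid transition, i.e. there is no edge between those two vertices.

```python
import math


def viterbi(start, transitions):
    """Return the most likely tag path through a sentence lattice.

    start: dict mapping tag -> probability of that tag for the first token.
    transitions: one dict per later token, mapping
        (previous_tag, tag) -> edge probability. Pairs that are absent
        are treated as invalid transitions (no edge in the lattice).
    """
    # best[tag] holds the log-probability of the best path ending in tag.
    best = {tag: math.log(p) for tag, p in start.items() if p > 0}
    backpointers = []

    for edges in transitions:
        new_best, pointers = {}, {}
        for (prev, cur), p in edges.items():
            if prev not in best or p <= 0:
                continue  # unreachable source vertex or zero-probability edge
            score = best[prev] + math.log(p)
            if cur not in new_best or score > new_best[cur]:
                new_best[cur] = score
                pointers[cur] = prev
        best = new_best
        backpointers.append(pointers)

    if not best:
        raise ValueError("no valid path through the lattice")

    # Follow the backpointers from the best final vertex to the first column.
    path = [max(best, key=best.get)]
    for pointers in reversed(backpointers):
        path.append(pointers[path[-1]])
    path.reverse()
    return path


# Hypothetical three-token sentence, e.g. "John Smith runs".
start = {"O": 0.3, "B-PER": 0.7}
transitions = [
    {("O", "O"): 0.6, ("O", "B-PER"): 0.4,
     ("B-PER", "I-PER"): 0.8, ("B-PER", "O"): 0.2},
    {("O", "O"): 0.9, ("O", "B-PER"): 0.1,
     ("I-PER", "O"): 0.9, ("I-PER", "I-PER"): 0.1,
     ("B-PER", "O"): 0.7, ("B-PER", "I-PER"): 0.3},
]
path = viterbi(start, transitions)
print(path)  # ['B-PER', 'I-PER', 'O']
```

Working in log space avoids numerical underflow on long sentences, and storing only the edges that exist keeps invalid transitions out of the search entirely.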