Difference between revisions of "J. Artiles et al. EMNLP 2009"

Revision as of 01:10, 31 October 2010

Citation

Javier Artiles, Enrique Amigó & Julio Gonzalo, The role of named entities in web people search, in EMNLP 2009

Online version

The role of named entities in web people search

Summary

This paper investigates the role of a number of features in solving the Web People Search clustering problem, focusing on the role of named entities (NEs). To compare different features, the authors reformulate the clustering problem as a classification problem: each pair of documents is classified as coreferent if the two documents share the same cluster, and as not coreferent otherwise.
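The reformulation from clusters to document pairs can be sketched as follows. This is a minimal illustration, not code from the paper: the function name `pairwise_labels` and the toy document and cluster identifiers are assumptions made for the example.

```python
from itertools import combinations


def pairwise_labels(cluster_of):
    """Turn a gold clustering into pairwise coreference labels.

    cluster_of maps a document id to its cluster id (hypothetical names).
    Returns a dict mapping each unordered document pair to True when both
    documents share a cluster (coreferent) and False otherwise.
    """
    docs = sorted(cluster_of)
    return {
        (a, b): cluster_of[a] == cluster_of[b]
        for a, b in combinations(docs, 2)
    }


# Toy clustering: doc1 and doc2 refer to the same person, doc3 to another.
clusters = {"doc1": "A", "doc2": "A", "doc3": "B"}
labels = pairwise_labels(clusters)
```

Each of the three document pairs becomes one classification instance, so any feature that scores a pair of documents can be evaluated directly against these labels.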

The major contribution of this paper is the Maximal Pairwise Accuracy (MPA) measure, an upper bound on the score achievable with a combination of features, regardless of the underlying machine learning algorithm and parameter settings.

For experiments, they used two standard datasets for Web People Search systems: WePS-1 and WePS-2. They concluded that:

1. NEs do not improve the clustering when compared with a combination of simpler features such as local, global and snippet tokens, n-grams, etc.

2. Results are sensitive to the NER system used.

MPA

Given a feature set $X=\{x_{1},x_{2},\dots ,x_{n}\}$, the intuition behind this score is that if at least one feature gives correct information, then a perfect algorithm would produce a correct output.
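One way to make this intuition concrete is sketched below. The page gives only the intuition, not the formal definition, so the formulation here is an assumption: each feature is treated as a similarity score over document pairs, and a comparison between a coreferent pair and a non-coreferent pair counts as correct when at least one feature scores the coreferent pair higher. The function name, the feature dictionaries and the toy scores are all hypothetical.

```python
from itertools import product


def maximal_pairwise_accuracy(features, positives, negatives):
    """Fraction of (coreferent, non-coreferent) comparisons that at least
    one feature gets right. Assumed formulation, for illustration only.

    features:  list of dicts mapping a document pair to a similarity score
    positives: document pairs that are coreferent
    negatives: document pairs that are not coreferent
    """
    total = 0
    hits = 0
    for pos, neg in product(positives, negatives):
        total += 1
        # Correct if any single feature ranks the coreferent pair higher.
        if any(f[pos] > f[neg] for f in features):
            hits += 1
    return hits / total if total else 0.0


# Toy scores for two features over four document pairs.
tokens = {"p1": 0.9, "p2": 0.2, "n1": 0.5, "n2": 0.1}
entities = {"p1": 0.3, "p2": 0.4, "n1": 0.35, "n2": 0.6}
positives = ["p1", "p2"]
negatives = ["n1", "n2"]

tokens_only = maximal_pairwise_accuracy([tokens], positives, negatives)
entities_only = maximal_pairwise_accuracy([entities], positives, negatives)
combined = maximal_pairwise_accuracy([tokens, entities], positives, negatives)
```

In this toy data the combined score is at least as high as either single-feature score, which is what an upper bound over feature combinations should give: adding a feature can only add comparisons that some feature gets right.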


@@ Line 16: / Line 16: @@
 if they share the same cluster or not coreferent, Otherwise.
-The major contribution of this paper is to introduce Maximal Pairwise Accurary measure
+The major contribution of this paper is to introduce Maximal Pairwise Accurary (MPA) measure
 that is an upper bound score for a combination of features
 regardless of the underlying machine learning algorithms used and parameter settings.
@@ Line 26: / Line 26: @@
 such as local, global and snippet tokens, n-grams, etc.
 # results are sensitive to the NER system used.
+== MPA ==
+Given a feature set <math> X = {x_{1}, x_{2}, \dots, x_{n} }</math>
+The intuition of this score is that if at least one feature gives correct information, then the perfect algorithm
+would produce a correct output.