Morante et al., 2010
Citation
Roser Morante, Vincent Van Asch, and Walter Daelemans. Memory-Based Resolution of In-Sentence Scopes of Hedge Cues. In Proceedings of the 2010 Conference on Computational Natural Language Learning.
Summary
The CoNLL 2010 Shared Task had two components: a binary classification task that labels each sentence as uncertain or not, and a more structured task that detects the scope of hedge cues within a given uncertain sentence. The latter is more directly relevant to the Structured Prediction course. This paper was the most successful entrant in that second, structured task.
Motivation
Task 1: Uncertain Sentence Classification
The first half of the CoNLL 2010 Shared Task was a binary classification problem, not a structured prediction problem, so we do not go into great detail here. The task is to determine, given a sentence, whether that sentence is uncertain. In this paper, classification was done by predicting the uncertainty of each word; then, in postprocessing, a sentence was marked as uncertain if 5% or more of its words were individually classified as uncertain.
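The postprocessing rule can be illustrated with a minimal sketch. It assumes the word-level classifier has already produced one boolean per token; the function name and the handling of empty sentences are illustrative choices, not details taken from the paper.

```python
from typing import Sequence


def sentence_is_uncertain(token_labels: Sequence[bool], threshold: float = 0.05) -> bool:
    """Mark a sentence uncertain when the share of uncertain tokens reaches the threshold.

    token_labels holds one boolean per token (True = classified as uncertain).
    The 5% default follows the summary above; treating an empty sentence as
    certain is an assumption made for this sketch.
    """
    if not token_labels:
        return False
    flagged = sum(1 for label in token_labels if label)
    return flagged / len(token_labels) >= threshold
```

With this rule, a single flagged word is enough to mark any sentence of 20 tokens or fewer as uncertain, so the threshold only filters out isolated word-level predictions in longer sentences.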
The learning step used a standard SVM with a polynomial kernel. The feature representation for each word contained its surface form, lemma, part of speech, and dependency information; the same kinds of features for the four words in its immediate context (two ahead and two behind); and two features based on a vocabulary list of cues.
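As a rough illustration of the windowed representation, the sketch below builds a feature dictionary for one word from itself and its two neighbors on each side. The dictionary keys, the padding symbol, and the restriction to form, lemma, and part of speech are assumptions made for this example; the dependency features and the two cue-list features are left out because the summary does not specify them.

```python
from typing import Dict, List

PAD = "<PAD>"  # hypothetical placeholder for positions outside the sentence


def token_features(tokens: List[Dict[str, str]], i: int, window: int = 2) -> Dict[str, str]:
    """Collect form, lemma, and POS for token i and its neighbors within the window.

    Each token is a dict with the keys "form", "lemma", and "pos".
    Feature names look like "form[-2]", "lemma[+0]", "pos[+1]".
    """
    feats: Dict[str, str] = {}
    for offset in range(-window, window + 1):
        j = i + offset
        in_sentence = 0 <= j < len(tokens)
        for key in ("form", "lemma", "pos"):
            feats[f"{key}[{offset:+d}]"] = tokens[j][key] if in_sentence else PAD
    return feats


sentence = [
    {"form": "Results", "lemma": "result", "pos": "NNS"},
    {"form": "may", "lemma": "may", "pos": "MD"},
    {"form": "vary", "lemma": "vary", "pos": "VB"},
]
features_for_may = token_features(sentence, 1)
```

Dictionaries like this would still need to be converted to numeric vectors (for example by one-hot encoding) before being passed to an SVM.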
This system achieves an F-score of 57 (precision 81, recall 44) on detecting Wikipedia weasel sentences, and an F-score of 82 (precision 81, recall 82) on detecting hedged sentences in biomedical scientific literature.