Difference between revisions of "Morante and Daelemans CoNLL 2009"

From Cohen Courses
(Created page with '== Citation == Jiao, F., Wang, S., Lee, C.H., Greiner, R., and Schuurmans, D. Semi-supervised conditional random fields for improved sequence segmentation and labeling. Proceed…')
 
Line 1: Line 1:
 
== Citation ==  
 
== Citation ==  
  
Jiao, F., Wang, S., Lee, C.H., Greiner, R., and Schuurmans, D. Semi-supervised conditional random fields for improved sequence segmentation and labeling. Proceedings of the 21st International Conference on Computational Linguistics. (2006) 209-216.
+
Morante, R. and Daelemans, W. A metalearning approach to processing the scope of negation. Proceedings of the Thirteenth Conference on Computational Natural Language Learning (2009).
  
 
== Online Version ==
 
== Online Version ==
  
http://acl.ldc.upenn.edu/P/P06/P06-1027.pdf
+
[http://citeseerx.ist.psu.edu/viewdoc/download?doi=10.1.1.164.3657&rep=rep1&type=pdf#page=37 citeseer]
  
 
== Summary ==  
 
== Summary ==  
  
This paper presented a novel [[Category::method]] for using a [[UsesMethod::CRF]] in a [[UsesMethod::semi-supervised learning]] setting. HMMs and other generative models easily incorporate unlabeled data using EM, but have difficulty with non-independent features. Semi-supervised discriminative approaches were less well explored. By incorporating extra data, the new technique improves the accuracy over a baseline CRF trained just on labeled data. In tandem, the authors developed an efficient dynamic programming algorithm to calculate a covariance matrix of features, something necessary to calculate the gradient and perform iterative ascent.
 
  
The key idea is to minimize the conditional entropy of the unlabeled data, thereby maximizing the certainty of the labellings and reinforcing the supervised labels. Equivalently, this is like maximizing the KL divergence, making two distributions "farther" apart or decreasing their overlap.  
+
This paper tackles a novel [[Category::problem]]: [[AddressesProblem::determining the precise scope of a negation term]] in biomedical text. Previous papers had shown that the negation status of a given medical concept (such as a patient's disease) could be assigned with high accuracy, but none had examined how to determine the precise scope of a given negation term. Negation (along with other context cues such as uncertainty) is crucial to proper information extraction (IE) from medical records, and increasingly to other applications such as sentiment detection.
  
The optimization criterion is to maximize the sum of the conditional likelihood of the labeled samples and the negative conditional entropy of the unlabeled examples, along with regularization. This extra entropy term leads to a non-concave optimization function. However, one can still attempt to improve on a fully supervised CRF by using its learned parameter values as the starting point of an L-BFGS algorithm.  
+
The approach proceeds in two phases: negation term identification, then scope identification. In the first phase, each token is tagged as beginning, inside, or outside a negation signal using an [[UsesMethod::information gain decision tree]] with local features. Some words are unambiguously negative ("no", "lack", "absent", etc.) and are simply assigned as negative automatically.
 +
 
 +
Scope is decided by classifying each token as the first element of the scope, the last element, or neither, using three classifiers: [[UsesMethod::kNN]], [[UsesMethod::SVM]], and [[UsesMethod::CRF]]. A CRF [[UsesMethod::meta-classifier]] then takes their predictions, plus additional features, to assign the final scope tags.
 +
 
 +
Determining scope accurately turns out to be a fairly difficult task: the PCS measure (whether the whole scope is correct or not) ranges from 0.40 to 0.70 depending on the data set. Token-by-token F1 scores are better, from 0.70 to 0.85. This does not compare well to the regex-based medical concept negation classifiers mentioned above (F1 ~0.95), but the comparison is somewhat apples to oranges, since those systems assign a negation status to a concept rather than delimit a scope.
  
An experiment on named entity recognition of gene names resulted in generally much improved recall and F-measures.
 
  
  
 
== Related Papers ==
 
== Related Papers ==
  
This form of minimum entropy regularization was first explored by [[RelatedPaper::Grandvalet and Bengio, ANIPS 2004]] for a single, unstructured, variable.
+
Regex- and rule-based negation classifiers frequently used in the clinical domain include [[RelatedPaper::Chapman et al J Biomed Inform 2001]], [[RelatedPaper::Mutalik et al J Am Med Inform Assoc 2001]], and [[RelatedPaper::Elkin et al BMC Medical Informatics and Decision Making 2005]].
  
CRFs were first proposed by [[RelatedPaper::Lafferty et al, ICML 2001]].
+
[[RelatedPaper::Councill et al Workshop on Negation and Speculation in NLP 2010]] followed up with a simpler CRF model that also performs well on sentiment analysis.
  
The dataset analyzed was from [[UsesDataset::McDonald et al 2005]].
+
The dataset consisted of clinical reports, biomedical texts, and biomedical abstracts annotated for various scopes by [[UsesDataset::Vincze et al BMC Bioinformatics 2008]].

Revision as of 01:18, 1 October 2010

Citation

Morante, R. and Daelemans, W. A metalearning approach to processing the scope of negation. Proceedings of the Thirteenth Conference on Computational Natural Language Learning (2009).

Online Version

citeseer: http://citeseerx.ist.psu.edu/viewdoc/download?doi=10.1.1.164.3657&rep=rep1&type=pdf#page=37

Summary

This paper tackles a novel problem: determining the precise scope of a negation term in biomedical text. Previous papers had shown that the negation status of a given medical concept (such as a patient's disease) could be assigned with high accuracy, but none had examined how to determine the precise scope of a given negation term. Negation (along with other context cues such as uncertainty) is crucial to proper information extraction (IE) from medical records, and increasingly to other applications such as sentiment detection.

The approach proceeds in two phases: negation term identification, then scope identification. In the first phase, each token is tagged as beginning, inside, or outside a negation signal using an information gain decision tree with local features. Some words are unambiguously negative ("no", "lack", "absent", etc.) and are simply assigned as negative automatically.
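
The first phase can be illustrated with a minimal sketch. It assumes a tiny lexicon holding only the three cue words quoted in this summary and a hypothetical `classify` callback standing in for the decision tree; neither the function names nor the tag strings "B", "I", "O" are taken from the paper's implementation.

```python
from typing import Callable, List

# Only the unambiguous cues quoted in the summary; the paper's full list is not reproduced.
UNAMBIGUOUS_CUES = {"no", "lack", "absent"}


def tag_negation_cues(
    tokens: List[str],
    classify: Callable[[List[str], int], str],
) -> List[str]:
    """Tag each token as B (begin), I (inside), or O (outside) a negation signal.

    Unambiguous cue words are tagged "B" by lookup. Every other token is
    deferred to `classify`, a hypothetical stand-in for the learned tagger.
    """
    tags = []
    for i, token in enumerate(tokens):
        if token.lower() in UNAMBIGUOUS_CUES:
            tags.append("B")
        else:
            tags.append(classify(tokens, i))
    return tags


# Usage with a trivial stand-in classifier that never predicts a cue.
sentence = ["There", "is", "no", "evidence", "of", "fracture"]
cue_tags = tag_negation_cues(sentence, lambda toks, i: "O")
```

With the stand-in classifier, only "no" receives a "B" tag; a real classifier would also handle ambiguous and multi-word cues.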

Scope is decided by classifying each token as the first element of the scope, the last element, or neither, using three classifiers: kNN, SVM, and CRF. A CRF meta-classifier then takes their predictions, plus additional features, to assign the final scope tags.
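
To make the first/last/neither encoding concrete, the sketch below turns a sequence of per-token tags into a scope span. The tag strings "F", "L", and "NONE" and the decoding rule (first "F", then the nearest following "L") are assumptions chosen for illustration; the summary does not say how the paper resolves malformed tag sequences.

```python
from typing import List, Optional, Tuple


def decode_scope(tags: List[str]) -> Optional[Tuple[int, int]]:
    """Return (first, last) inclusive token indices of the scope.

    Returns None when the tags do not contain an "F" followed by an "L".
    """
    try:
        first = tags.index("F")
    except ValueError:
        return None
    # Assumed rule: close the scope at the nearest "L" after the first "F".
    for i in range(first + 1, len(tags)):
        if tags[i] == "L":
            return (first, i)
    return None


# Usage: tags for a six-token sentence whose scope runs from token 2 to token 5.
span = decode_scope(["NONE", "NONE", "F", "NONE", "NONE", "L"])
```

Predicting only the two boundary tokens keeps the label set small, but it also means a single misplaced boundary invalidates the whole span.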

Determining scope accurately turns out to be a fairly difficult task: the PCS measure (whether the whole scope is correct or not) ranges from 0.40 to 0.70 depending on the data set. Token-by-token F1 scores are better, from 0.70 to 0.85. This does not compare well to the regex-based medical concept negation classifiers mentioned above (F1 ~0.95), but the comparison is somewhat apples to oranges, since those systems assign a negation status to a concept rather than delimit a scope.
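
The gap between the two measures is easier to see in code. The following sketch assumes each scope is an inclusive (first, last) token span and that token-level F1 is micro-averaged over all scopes; the paper's exact evaluation script may differ, and the function names are illustrative.

```python
from typing import List, Tuple

Span = Tuple[int, int]  # inclusive (first, last) token indices


def pcs(gold: List[Span], pred: List[Span]) -> float:
    """Fraction of scopes whose predicted span matches the gold span exactly."""
    correct = sum(1 for g, p in zip(gold, pred) if g == p)
    return correct / len(gold)


def token_f1(gold: List[Span], pred: List[Span]) -> float:
    """Micro-averaged F1 over the tokens covered by each scope."""
    tp = fp = fn = 0
    for g, p in zip(gold, pred):
        g_tokens = set(range(g[0], g[1] + 1))
        p_tokens = set(range(p[0], p[1] + 1))
        tp += len(g_tokens & p_tokens)
        fp += len(p_tokens - g_tokens)
        fn += len(g_tokens - p_tokens)
    if tp == 0:
        return 0.0
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)


# Usage: the second prediction misses a single token at the end of the scope.
gold_spans = [(2, 5), (0, 3)]
pred_spans = [(2, 5), (0, 2)]
scope_accuracy = pcs(gold_spans, pred_spans)
f1 = token_f1(gold_spans, pred_spans)
```

In this toy example one missing token halves the whole-scope score while the token-level F1 stays above 0.9, which mirrors the pattern in the reported results.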


Related Papers

Regex- and rule-based negation classifiers frequently used in the clinical domain include Chapman et al J Biomed Inform 2001, Mutalik et al J Am Med Inform Assoc 2001, and Elkin et al BMC Medical Informatics and Decision Making 2005.

Councill et al Workshop on Negation and Speculation in NLP 2010 followed up with a simpler CRF model that also performs well on sentiment analysis.

The dataset consisted of clinical reports, biomedical texts, and biomedical abstracts annotated for various scopes by Vincze et al BMC Bioinformatics 2008.