Morante and Daelemans CoNLL 2009
Latest revision as of 17:08, 9 October 2010
Citation
Morante, R. and Daelemans, W. A metalearning approach to processing the scope of negation. Proceedings of the Thirteenth Conference on Computational Natural Language Learning (2009).
Online Version
[citeseer]
Summary
This paper tackles the novel problem of determining the precise scope of a negation term in biomedical text. Previous papers had shown that the negation status of a given medical concept (such as a patient's disease) could be assigned with high accuracy, but none had examined determining the precise scope of a given negation term. Negation (along with other context cues such as uncertainty) is crucial to proper IE of medical records, and increasingly to other applications such as sentiment detection.
The approach proceeded in two phases: negation-term identification and then scope identification. Tokens were tagged as beginning, inside, or outside a negation signal using an information-gain decision tree with local features. Some words are unambiguously negative ("no", "lack", "absent", etc.) and are simply assigned as negative automatically.
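The cue-identification phase above can be sketched as a lexicon lookup backed by a classifier; this is an illustrative reconstruction, not the paper's code, and the cue list and the pluggable `classify` callback are hypothetical stand-ins for the trained decision tree:

```python
# Illustrative list of unambiguous cues; the paper's lexicon is derived from corpus data.
UNAMBIGUOUS_CUES = {"no", "lack", "absent", "without", "never"}

def tag_negation_cues(tokens, classify=None):
    """Return a B/I/O tag per token marking negation signals.

    Unambiguous cues are assigned directly; for the rest, a trained
    classifier over local features (word, neighbours, POS) would decide.
    """
    tags = []
    for i, tok in enumerate(tokens):
        if tok.lower() in UNAMBIGUOUS_CUES:
            tags.append("B-NEG")
        elif classify is not None:
            tags.append(classify(tokens, i))  # e.g. the decision tree's B/I/O output
        else:
            tags.append("O")
    return tags

print(tag_negation_cues("the patient shows no signs of infection".split()))
# → ['O', 'O', 'O', 'B-NEG', 'O', 'O', 'O']
```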
Scope was decided by classifying each token as the first element of the scope, the last, or neither, using three classifiers: a kNN, an SVM, and a CRF. A CRF meta-classifier then takes these predictions, together with additional features, and assigns the final scope tags.
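The stacking step can be illustrated as follows; the feature names and the "F"/"L"/"NONE" label strings here are placeholders, not the paper's exact representation. The idea is simply that each base classifier's per-token prediction becomes an input feature for the meta-classifier:

```python
def stack_features(token_feats, knn_pred, svm_pred, crf_pred):
    """Build the meta-classifier's feature dict for one token by
    appending the three base classifiers' scope predictions."""
    feats = dict(token_feats)  # original token-level features
    feats.update({"knn": knn_pred, "svm": svm_pred, "crf": crf_pred})
    return feats

row = stack_features({"word": "signs", "pos": "NNS"}, "F", "F", "NONE")
print(row)
# → {'word': 'signs', 'pos': 'NNS', 'knn': 'F', 'svm': 'F', 'crf': 'NONE'}
```

A CRF trained over these stacked rows can then exploit agreement (or disagreement) among the base classifiers when emitting the final first/last/neither tags.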
Determining scope accurately turns out to be a fairly difficult task, with the PCS measure (whether the whole scope is correct or not) ranging from 0.40 to 0.70 depending on the data set. Token-by-token F1 scores were better, from 0.70 to 0.85. This does not compare well to the regex-based medical-concept negation classifiers mentioned above (F1 ~0.95), but that is somewhat an apples-to-oranges comparison.
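The gap between the two numbers reflects how the metrics are defined: PCS gives credit only for exactly correct scopes, while token-level F1 gives partial credit for overlap. A minimal sketch of both, representing each scope as a set of token indices:

```python
def pcs(gold_scopes, pred_scopes):
    """Percentage of correct scopes: a prediction counts only if it
    matches the gold scope exactly."""
    correct = sum(1 for g, p in zip(gold_scopes, pred_scopes) if g == p)
    return correct / len(gold_scopes)

def token_f1(gold_scopes, pred_scopes):
    """Token-level F1: partial credit for each correctly included token."""
    tp = fp = fn = 0
    for g, p in zip(gold_scopes, pred_scopes):
        tp += len(g & p)
        fp += len(p - g)
        fn += len(g - p)
    prec = tp / (tp + fp) if tp + fp else 0.0
    rec = tp / (tp + fn) if tp + fn else 0.0
    return 2 * prec * rec / (prec + rec) if prec + rec else 0.0

gold = [{3, 4, 5, 6}, {1, 2}]
pred = [{3, 4, 5}, {1, 2}]   # first scope misses one token
print(pcs(gold, pred))                 # → 0.5 (only one scope exactly right)
print(round(token_f1(gold, pred), 2))  # → 0.91 (most tokens still correct)
```

Missing a single token thus costs an entire scope under PCS but only slightly dents token-level F1, which is why the paper's PCS numbers sit well below its F1 numbers.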
Related Papers
Regex and rules based negation classifiers frequently used in the clinical domain include Chapman et al J Biomed Inform 2001, Mutalik et al J Am Med Inform Assoc 2001, and Elkin et al BMC Medical Informatics and Decision Making 2005.
Councill et al Workshop on Negation and Speculation in NLP 2010 followed up with a simpler CRF model that performs well also on sentiment analysis.
The dataset consisted of clinical reports, biomedical texts, and biomedical abstracts annotated for various scopes by Vincze et al BMC Bioinformatics 2008.