Heinze et al AMIA 2001

Latest revision as of 01:01, 1 October 2010

Citation

Daniel T. Heinze et al. 2001. Mining Free-Text Medical Records. In Proceedings of AMIA Symposium, 254-258.

Online version

AMIA Annual Symposium Proceedings Archive

Summary

The paper presents a MEDical Information Extraction (MedIE) system, which extracts patient information from free-text clinical records.

The authors divide the extraction job into three tasks:

  • extraction of medical terms
  • relation extraction
    • extraction of associated medical concepts
    • e.g. "Blood pressure" and "144/90" in the sentence "Blood pressure is 144/90"
  • text classification
    • e.g. a patient can be classified as a former smoker, a current smoker, or a non-smoker
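
To make the relation-extraction example concrete, here is a minimal Python sketch that pairs a term with its value in a sentence such as "Blood pressure is 144/90". It assumes a single hand-written regular expression of the form "<term> is <value>"; the names PAIR_PATTERN and extract_pairs are hypothetical, and this illustrates the task only, not the paper's actual method.

```python
import re

# Hypothetical pattern (an assumption for illustration): "<term> is <value>",
# where the value is a number or a slash-separated pair such as 144/90.
PAIR_PATTERN = re.compile(
    r"(?P<term>[A-Za-z][A-Za-z ]*?)\s+is\s+"
    r"(?P<value>\d+(?:\.\d+)?(?:/\d+(?:\.\d+)?)?)"
)


def extract_pairs(sentence):
    """Return (term, value) pairs matched by the illustrative pattern."""
    return [
        (match.group("term").strip(), match.group("value"))
        for match in PAIR_PATTERN.finditer(sentence)
    ]


if __name__ == "__main__":
    print(extract_pairs("Blood pressure is 144/90"))
```

A fixed pattern like this covers only one sentence shape, which is one reason a parser-based approach with a pattern fallback is attractive for free text.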

Their approaches are:

  • An ontology-based approach for extracting medical terms of interest
    • They used the Unified Medical Language System (UMLS).
    • For terms not defined in UMLS, they predicted the categories of some terms from sentence structure.
  • A graph-based approach to relation extraction that uses the output of a link-grammar parser
    • They included negation processing.
    • When the parser failed, they fell back to a pattern-based approach.
    • Because the parser did not handle multi-word terms, they replaced those terms with placeholders.
  • An NLP-based feature extraction method coupled with an ID3-based decision tree for text classification
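
The ID3 step in the last approach can be sketched as follows. ID3 builds a decision tree by repeatedly splitting on the feature with the highest information gain; the Python sketch below computes that gain for a toy smoking-status data set. The feature names (mentions_quit, negated_smoking), the rows, and the labels are invented for illustration and are not taken from the paper.

```python
import math
from collections import Counter


def entropy(labels):
    """Shannon entropy (in bits) of a list of class labels."""
    total = len(labels)
    return -sum(
        (count / total) * math.log2(count / total)
        for count in Counter(labels).values()
    )


def information_gain(rows, labels, feature):
    """Entropy reduction obtained by splitting the data on one feature."""
    groups = {}
    for row, label in zip(rows, labels):
        groups.setdefault(row[feature], []).append(label)
    remainder = sum(
        len(group) / len(labels) * entropy(group) for group in groups.values()
    )
    return entropy(labels) - remainder


def best_feature(rows, labels):
    """Feature ID3 would split on first: the one with the highest gain."""
    return max(rows[0], key=lambda f: information_gain(rows, labels, f))


# Toy data (hypothetical features and labels, for illustration only).
ROWS = [
    {"mentions_quit": True, "negated_smoking": False},
    {"mentions_quit": True, "negated_smoking": False},
    {"mentions_quit": False, "negated_smoking": False},
    {"mentions_quit": False, "negated_smoking": True},
    {"mentions_quit": False, "negated_smoking": True},
    {"mentions_quit": False, "negated_smoking": True},
]
LABELS = ["former", "former", "current", "non", "non", "non"]

if __name__ == "__main__":
    for name in ROWS[0]:
        print(name, round(information_gain(ROWS, LABELS, name), 3))
    print("split on:", best_feature(ROWS, LABELS))
```

A full ID3 implementation would apply best_feature recursively to each resulting subset until the labels in a subset are pure or no features remain.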


This approach was fairly successful, with precision and recall mostly above 80%. However, the system was tested on records written by a single clinician, which means that the style of the free-text records was consistent.

Related papers

A follow-up paper is Denecke and Bernauer AIME 2007.