Difference between revisions of "Zhou et al ACM symposium on Applied Computing 2006"
Revision as of 16:58, 9 October 2010
Citation
Xiaohua Zhou et al. 2006. Approaches to Text Mining for Clinical Medical Records. In Proceedings of the 2006 ACM symposium on Applied computing, 235-239.
Online version
Summary
The paper presents a MEDical Information Extraction (MedIE) system, which extracts patient information from free-text clinical records.
The authors divided the extraction job into three tasks:
- Extraction of medical terms
- Relation Extraction
  - extraction of associated medical concepts
  - e.g. "Blood pressure" and "144/90" in the sentence "Blood pressure is 144/90"
- Text Classification
  - e.g. a patient can be classified as a former smoker, a current smoker, or a non-smoker
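To make the relation-extraction example concrete, here is a minimal sketch of what pairing a term with its value might look like. It is only an illustration: the regular expression, the name `TERM_VALUE`, and the function `extract_term_value` are hypothetical and are not taken from the paper, which relies on a parser-based method rather than a single pattern.

```python
import re

# Hypothetical pattern (an assumption, not the paper's rule set):
# "<term> is <value>", where the value is a number or a ratio such as 144/90.
TERM_VALUE = re.compile(
    r"^(?P<term>[A-Za-z ]+?)\s+is\s+(?P<value>\d+(?:/\d+)?)\b"
)


def extract_term_value(sentence):
    """Return (term, value) if the sentence matches the pattern, else None."""
    match = TERM_VALUE.match(sentence.strip())
    if match is None:
        return None
    return match.group("term"), match.group("value")


if __name__ == "__main__":
    print(extract_term_value("Blood pressure is 144/90"))
```

A sentence without a numeric value, such as "The patient is well", does not match and yields `None`.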
Their approaches are:
- An ontology-based approach for extracting medical terms of interest
  - They used the Unified Medical Language System (UMLS).
  - For terms not defined in UMLS, they predicted the categories of some terms from sentence structure.
- A graph-based approach that uses the output of a link-grammar parser for Relation Extraction
  - They included the processing of negation.
  - When the parser failed, they fell back to a pattern-based approach.
  - Because the parser did not handle multi-word terms, they replaced such terms with placeholders.
- An NLP-based feature extraction method coupled with ID3-based Decision Tree Learning for Text Classification
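The placeholder step for multi-word terms can be sketched roughly as follows. This is an assumption-laden illustration, not the authors' implementation: the function names `replace_terms` and `restore_terms`, the `TERM_<n>_` placeholder format, and the case-sensitive string matching are all hypothetical choices made for the example.

```python
def replace_terms(sentence, terms):
    """Replace each multi-word term with a single-token placeholder.

    Returns the rewritten sentence and a mapping from placeholder to term.
    Longer terms are handled first so a shorter term cannot split a longer one.
    Matching is plain, case-sensitive substring matching (an assumption).
    """
    mapping = {}
    for index, term in enumerate(sorted(terms, key=len, reverse=True)):
        if term in sentence:
            # Hypothetical placeholder format; the trailing underscore keeps
            # TERM_1_ from being a prefix of TERM_10_.
            placeholder = f"TERM_{index}_"
            sentence = sentence.replace(term, placeholder)
            mapping[placeholder] = term
    return sentence, mapping


def restore_terms(text, mapping):
    """Put the original terms back after parsing."""
    for placeholder, term in mapping.items():
        text = text.replace(placeholder, term)
    return text


if __name__ == "__main__":
    original = "Blood pressure is 144/90 and heart rate is normal"
    rewritten, mapping = replace_terms(original, ["heart rate", "Blood pressure"])
    print(rewritten)
    print(restore_terms(rewritten, mapping) == original)
```

The idea is that the parser then sees one token per term, and the mapping lets the extracted relations be translated back to the original terms.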
This approach was fairly successful, with precision and recall mostly above 80%. However, the system was tested only on records written by a single clinician, which means that the style of the free-text records was consistent. Nevertheless, the research is valuable in that the authors applied various IE techniques to free-text clinical records and explained the problems they encountered.
Related papers
An interesting follow-up paper is Denecke and Bernauer AIME 2007, which uses semantic structures to extract medical information.