Difference between revisions of "Hyejuj project abstract"
== References ==
* [http://ir.kaist.ac.kr/papers/2006/Integration%20of%20Low%20Level%20Linguistic%20Information%20for%20Clinical%20Document%20Semantic%20Tagging.pdf Hyeju Jang, Yun Jin, Sung Hyon Myaeng, ''Integration of Low Level Linguistic Information for Clinical Document Semantic Tagging'', IEEE Conf. on Information Reuse and Integration 2006.]
Revision as of 17:12, 8 October 2010
What I plan to do
I propose a semantic tagger that provides high-level concept information for phrases in clinical narrative texts. I will use clinical narrative documents written by Korean doctors. The high-level concepts to be annotated are listed below.
Target Semantic Tags
- Symptom
- Diagnosis
- Test
- Test Result
- Treatment Plan
- Treatment
- Treatment Stop
- Performance
- Patient Result
Motivation
Clinical documents are an invaluable source of information for medical research and future treatment planning. However, hospitals do not use them efficiently, and most of the work is still done manually because no tools exist in Korea to process such clinical texts automatically. Semantic tagging of clinical documents can help in developing applications that are useful for doctors.
Interesting point
The clinical documents are written in a mix of Korean and English. English is usually used for medical terminology, and Korean for some general nouns and most verbs, though there are many exceptions.
Dataset
I have 600 clinical narrative documents. They have been automatically tagged with Unified Medical Language System (UMLS) tags and part-of-speech (POS) tags, and manually tagged with the target semantic tags.
Evaluation
The performance of the system is measured as annotation accuracy: the number of correct tags divided by the total number of tags.
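The accuracy measure described above can be sketched in a few lines of Python. This is only an illustration: the function name `tagging_accuracy` and the example tag sequences are made up here and do not come from the project data.

```python
def tagging_accuracy(gold_tags, predicted_tags):
    """Return the number of correct tags divided by the total number of tags."""
    if len(gold_tags) != len(predicted_tags):
        raise ValueError("gold and predicted sequences must have the same length")
    if not gold_tags:
        raise ValueError("cannot compute accuracy on an empty sequence")
    # A tag counts as correct when it matches the manual (gold) annotation.
    correct = sum(1 for g, p in zip(gold_tags, predicted_tags) if g == p)
    return correct / len(gold_tags)


# Invented example: one of the four predicted tags is wrong.
gold = ["Symptom", "Test", "Test Result", "Treatment"]
pred = ["Symptom", "Test", "Diagnosis", "Treatment"]
print(tagging_accuracy(gold, pred))  # 3 correct out of 4 -> 0.75
```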
Techniques that can be used to solve this problem
- Use UMLS tags, POS tags, abbreviations, clue words, and numerical information as features to produce higher-level concept information.
- Use a Conditional Random Field (CRF).
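As a rough sketch of how the listed information sources might be turned into CRF input, the Python below builds one feature dictionary per token, which is a common input shape for CRF toolkits. Everything specific here is an assumption for illustration: the token keys (`text`, `pos`, `umls`, `abbr`, `clue`), the POS labels, and the one-token context window are hypothetical and are not taken from the project.

```python
def token_features(sentence, i):
    """Build a feature dict for the token at position i.

    Each token is a dict with hypothetical keys:
    'text', 'pos', and optionally 'umls', 'abbr', 'clue'.
    """
    tok = sentence[i]
    feats = {
        "bias": 1.0,
        "pos": tok["pos"],
        "umls": tok.get("umls") or "NONE",
        "is_abbreviation": bool(tok.get("abbr", False)),
        "clue": tok.get("clue") or "NONE",
        # Crude stand-in for the numerical information (dates, doses).
        "has_digit": any(ch.isdigit() for ch in tok["text"]),
    }
    if i > 0:
        prev = sentence[i - 1]
        feats["prev_pos"] = prev["pos"]
        feats["prev_umls"] = prev.get("umls") or "NONE"
    else:
        feats["BOS"] = True  # beginning of sentence
    if i < len(sentence) - 1:
        nxt = sentence[i + 1]
        feats["next_pos"] = nxt["pos"]
        feats["next_umls"] = nxt.get("umls") or "NONE"
    else:
        feats["EOS"] = True  # end of sentence
    return feats


def sentence_features(sentence):
    """Return one feature dict per token, in order."""
    return [token_features(sentence, i) for i in range(len(sentence))]


# Invented two-token example mixing English and Korean.
example = [
    {"text": "fever", "pos": "NN", "umls": "Sign or Symptom"},
    {"text": "발생", "pos": "NNG", "clue": "performance"},
]
for feats in sentence_features(example):
    print(feats)
```

The neighbouring-token features are included because a sequence model like a CRF benefits from local context; the window size is a design choice that would need tuning on the actual data.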
What question to answer
Can we show good performance on high-level semantic tagging using CRF?
Team Member
Hyeju Jang [hyejuj@cs.cmu.edu]
Related Experience
I developed a semantic tagger using a Hidden Markov Model (HMM) in 2006. At that time, the target semantic tags were "Symptom", "Therapy", and "Performance".
Feature | Values or description |
---|---|
UMLS tag for cause | Biomedical or Dental Material, Food |
UMLS tag for disease or symptom | Finding, Sign or Symptom, Disease or Syndrome, Neoplastic Process |
UMLS tag for therapy | Diagnostic Procedure, Food, Medical Device, Therapeutic or Preventive Procedure |
Clue word for therapy | 처방(prescription), 복용(administer medicine), 시행(operation), 후(after), 이후(later), 사용(use), 증량(increase), 수술(surgery), 중단(discontinue) |
Clue word for symptom | 발열(having fever), 관찰(observe) |
Clue word for performance | 호전(improvement), 감소(decrease), 상승(rise), 정상(normal), 발생(occurrence), 변화(change) |
Numeric for date | Date of the event, time-order information |
Numeric for prescription | The frequency of taking medication, dose information |
Unknown | Neither clue word nor UMLS tag |
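The UMLS and clue-word rows of the table above could be encoded as lookup sets, roughly as in the Python sketch below. The names `UMLS_FEATURES`, `CLUE_WORDS`, and `feature_categories` are hypothetical, and the two numeric rows are left out because telling a date from a prescription number needs context that the table does not specify. Note that "Food" is listed under both cause and therapy, so the function returns a list rather than a single category.

```python
# UMLS tags per feature row, copied from the table.
UMLS_FEATURES = {
    "cause": {"Biomedical or Dental Material", "Food"},
    "disease or symptom": {
        "Finding", "Sign or Symptom", "Disease or Syndrome", "Neoplastic Process",
    },
    "therapy": {
        "Diagnostic Procedure", "Food", "Medical Device",
        "Therapeutic or Preventive Procedure",
    },
}

# Korean clue words per feature row, copied from the table.
CLUE_WORDS = {
    "therapy": {"처방", "복용", "시행", "후", "이후", "사용", "증량", "수술", "중단"},
    "symptom": {"발열", "관찰"},
    "performance": {"호전", "감소", "상승", "정상", "발생", "변화"},
}


def feature_categories(word, umls_tag=None):
    """Return every table row that matches the word or its UMLS tag."""
    categories = []
    for name, tags in UMLS_FEATURES.items():
        if umls_tag in tags:
            categories.append(f"UMLS tag for {name}")
    for name, words in CLUE_WORDS.items():
        if word in words:
            categories.append(f"Clue word for {name}")
    # Neither a clue word nor a known UMLS tag.
    return categories or ["unknown"]


print(feature_categories("수술"))
print(feature_categories("milk", "Food"))
print(feature_categories("환자"))
```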