Cohen Courses:Dmovshov abbreviations
Identifying Abbreviations in Biomedical Text
Idea
Abbreviations, synonyms, and acronyms are heavily used in biomedical literature to name genes, diseases, biological processes, and more. Recognizing short or alternative name forms and mapping them to their full forms is important for fully understanding scientific text. In an information extraction task, recognizing abbreviated names can substantially increase recall. The task is especially challenging because abbreviations are often reused (for example, names of genes and systems are shared across species) and because researchers often do not adhere to standard naming conventions.
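To make the task concrete, here is a minimal sketch of one very simple heuristic: look for a parenthesized short form and accept the preceding words as its full form only if their initials spell the short form. This is an illustrative assumption, not the method of any of the papers listed under Related Work, and the function name `find_abbreviation_pairs` is hypothetical. It misses many real cases (short forms built from inner letters, full forms containing stop words, definitions written as "SF (long form)").

```python
import re


def find_abbreviation_pairs(sentence):
    """Return (short_form, long_form) pairs for 'long form (SF)' patterns.

    Illustrative heuristic only: the long form is assumed to be the
    len(SF) words right before the parenthesis, and it is accepted only
    if the initials of those words spell the short form.
    """
    pairs = []
    # Candidate short form: 2 to 10 letters/digits inside parentheses,
    # starting with a letter.
    for match in re.finditer(r"\(([A-Za-z][A-Za-z0-9]{1,9})\)", sentence):
        short_form = match.group(1)
        words_before = sentence[:match.start()].split()
        candidate = words_before[-len(short_form):]
        if len(candidate) < len(short_form):
            continue  # not enough preceding words to form a full name
        initials = "".join(word[0] for word in candidate)
        if initials.lower() == short_form.lower():
            pairs.append((short_form, " ".join(candidate)))
    return pairs


example = "Patients with chronic obstructive pulmonary disease (COPD) were enrolled."
print(find_abbreviation_pairs(example))
# [('COPD', 'chronic obstructive pulmonary disease')]
```

A heuristic this strict favors precision over recall, which is one reason more flexible matching is needed for biomedical text.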
Team
Dana Movshovitz-Attias
Dataset
MEDSTRACT is a tool for automated extraction of acronym pairs from MEDLINE databases. It includes:
- Gold Standard Data: Sentences that include abbreviations.
- Gold Standard Results: Pairs of abbreviations and their full-form names that appear in the data.
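Gold-standard pairs of this kind allow extracted pairs to be scored with precision, recall, and F1. The sketch below shows one way to do that. The file format of the gold standard is not described here, so pairs are passed in as plain Python tuples; the function name `evaluate_pairs` and the choice of exact matching after lowercasing and whitespace normalization are assumptions made for illustration.

```python
def evaluate_pairs(predicted, gold):
    """Compute (precision, recall, f1) over (short_form, long_form) pairs.

    Assumption: a predicted pair counts as correct only if it matches a
    gold pair exactly after lowercasing and collapsing whitespace.
    """
    def normalize(pairs):
        return {
            (short.strip().lower(), " ".join(long_form.lower().split()))
            for short, long_form in pairs
        }

    predicted_set = normalize(predicted)
    gold_set = normalize(gold)
    true_positives = len(predicted_set & gold_set)

    precision = true_positives / len(predicted_set) if predicted_set else 0.0
    recall = true_positives / len(gold_set) if gold_set else 0.0
    if precision + recall == 0:
        return precision, recall, 0.0
    f1 = 2 * precision * recall / (precision + recall)
    return precision, recall, f1


# Hypothetical toy data, not taken from the actual gold standard.
gold = [
    ("COPD", "Chronic obstructive pulmonary disease"),
    ("HSP", "heat shock protein"),
]
predicted = [
    ("COPD", "chronic obstructive pulmonary disease"),
    ("HSP", "heat shock"),
]
print(evaluate_pairs(predicted, gold))
# (0.5, 0.5, 0.5)
```

Exact matching is the strictest option; a partial-match criterion for long forms would give credit to near misses such as the second predicted pair above.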
Related Work
- A simple algorithm for identifying abbreviation definitions in biomedical text by A. S. Schwartz and M. A. Hearst
- An Automatic Identification and Resolution System for Protein-Related Abbreviations in Scientific Papers by Paolo Atzeni, Fabio Polticelli and Daniele Toti
- Mapping Abbreviations to Full Forms in Biomedical Articles by Hong Yu, George Hripcsak and Carol Friedman