ABBREVIATION AND ACRONYM IDENTIFICATION AND EXPANSION WITHIN MEDICAL HEALTH RECORDS
Recent years have seen a rapid increase in digitised medical information. In particular, the massive expansion of Electronic Health Records (EHRs), which are designed to document all clinically relevant information arising from a patient's use of a healthcare facility, has introduced unprecedented volumes of relatively unstructured data. This paper aims to determine the extent to which knowledge discovery relating to abbreviations and acronyms can be achieved within heterogeneous data. Heterogeneous data, such as the narrative free-text notes found within patients' EHRs, may indicate contractions inconsistently and may use non-standard definitions for both abbreviations and acronyms. We approached this task through the retrieval and classification of contractions, together with a novel method of combining multiple publicly available repositories. To provide better coverage of abbreviations, and to address the issue of neologisms in general, word embeddings were applied to find semantically similar lexemes.
National University of Ireland, Dublin (UCD)
Open access repository: