A Neural Topic-Attention Model for Medical Term Abbreviation Disambiguation

Li, Irene; Yasunaga, Michihiro; Nuzumlalı, Muhammed Yavuz; Caraballo, Cesar; Mahajan, Shiwani; Krumholz, Harlan; Radev, Dragomir

arXiv:1910.14076 (cs)

[Submitted on 30 Oct 2019]

Abstract: Automated analysis of clinical notes is attracting increasing attention. However, there has not been much work on medical term abbreviation disambiguation. Such abbreviations are abundant and highly ambiguous in clinical documents. One of the main obstacles is the lack of large-scale, balanced labeled datasets. To address this issue, we propose a few-shot learning approach that takes advantage of limited labeled data. Specifically, a neural topic-attention model is applied to learn improved contextualized sentence representations for medical term abbreviation disambiguation. Another vital issue is that the existing scarce annotations are noisy and incomplete. We re-examine and correct an existing dataset for training, and collect a test set to evaluate the models fairly, especially for rare senses. We train our model on a training set of 30 abbreviation terms as categories (on average, 479 samples and 3.24 classes per term) selected from a public abbreviation disambiguation dataset, and then test on a manually created balanced dataset (each class in each term has 15 samples). We show that enhancing the sentence representation with topic information improves performance on small-scale unbalanced training datasets by a large margin, compared to a number of baseline models.
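The abstract does not spell out how topic information enters the sentence representation, so the following is only a minimal sketch of one generic way to attend over topic vectors, not the authors' architecture. It assumes dot-product scoring, a softmax over topics, and concatenation of the resulting topic context with the sentence vector. The names `topic_attention`, `sentence_vec`, and `topic_vecs` are hypothetical and chosen for illustration.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def dot(u, v):
    """Dot product of two equal-length vectors."""
    return sum(a * b for a, b in zip(u, v))


def topic_attention(sentence_vec, topic_vecs):
    """Illustrative topic attention (an assumption, not the paper's exact model).

    Scores each topic vector against the sentence vector, normalizes the
    scores with softmax, builds a weighted topic context vector, and
    returns the sentence vector concatenated with that context.
    """
    scores = [dot(sentence_vec, t) for t in topic_vecs]
    weights = softmax(scores)
    dim = len(topic_vecs[0])
    context = [
        sum(w * t[i] for w, t in zip(weights, topic_vecs))
        for i in range(dim)
    ]
    return sentence_vec + context, weights


# Toy inputs: a 2-d sentence vector and two 2-d topic vectors.
enhanced, weights = topic_attention([1.0, 0.0], [[1.0, 0.0], [0.0, 1.0]])
```

In this toy call the first topic is aligned with the sentence vector, so it receives the larger attention weight, and the enhanced representation has twice the dimensionality of the input. A classifier over the senses of one abbreviation would then consume the enhanced vector in place of the plain sentence vector.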

Subjects:	Computation and Language (cs.CL)
Cite as:	arXiv:1910.14076 [cs.CL]
	(or arXiv:1910.14076v1 [cs.CL] for this version)
	https://doi.org/10.48550/arXiv.1910.14076

Submission history

From: Irene Li
[v1] Wed, 30 Oct 2019 18:39:46 UTC (2,106 KB)
