Insights into Analogy Completion from the Biomedical Domain

Denis Newman-Griffis, Albert Lai, Eric Fosler-Lussier


Abstract
Analogy completion has been a popular task in recent years for evaluating the semantic properties of word embeddings, but the standard methodology makes a number of assumptions about analogies that do not always hold, either in recent benchmark datasets or when expanding into other domains. Through an analysis of analogies in the biomedical domain, we identify three assumptions: that of a Single Answer for any given analogy, that the pairs involved describe the Same Relationship, and that each pair is Informative with respect to the other. We propose modifying the standard methodology to relax these assumptions by allowing for multiple correct answers, reporting MAP and MRR in addition to accuracy, and using multiple example pairs. We further present BMASS, a novel dataset for evaluating linguistic regularities in biomedical embeddings, and demonstrate that the relationships described in the dataset pose significant semantic challenges to current word embedding methods.
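The metrics named in the abstract can be sketched as follows, assuming each analogy yields a ranked list of candidate terms and a set of acceptable answers. The function names, the toy data, and the choice to normalize average precision by the total number of correct answers are illustrative assumptions, not the authors' implementation (see the linked repository for that).

```python
from typing import Dict, List, Sequence, Set, Tuple


def reciprocal_rank(ranked: Sequence[str], correct: Set[str]) -> float:
    """Return 1/rank of the first correct candidate, or 0.0 if none appears."""
    for rank, candidate in enumerate(ranked, start=1):
        if candidate in correct:
            return 1.0 / rank
    return 0.0


def average_precision(ranked: Sequence[str], correct: Set[str]) -> float:
    """Average of precision@k over the ranks k where a correct answer appears.

    Assumption: normalized by the total number of correct answers, so
    correct answers missing from the ranking lower the score.
    """
    if not correct:
        return 0.0
    hits = 0
    precision_sum = 0.0
    for rank, candidate in enumerate(ranked, start=1):
        if candidate in correct:
            hits += 1
            precision_sum += hits / rank
    return precision_sum / len(correct)


def evaluate(queries: List[Tuple[Sequence[str], Set[str]]]) -> Dict[str, float]:
    """Compute accuracy (top-1 in the answer set), MAP, and MRR over all queries."""
    n = len(queries)
    if n == 0:
        return {"accuracy": 0.0, "MAP": 0.0, "MRR": 0.0}
    accuracy = sum(1.0 for ranked, correct in queries if ranked and ranked[0] in correct)
    mean_ap = sum(average_precision(ranked, correct) for ranked, correct in queries)
    mean_rr = sum(reciprocal_rank(ranked, correct) for ranked, correct in queries)
    return {"accuracy": accuracy / n, "MAP": mean_ap / n, "MRR": mean_rr / n}


# Toy placeholder data: (ranked candidates, set of acceptable answers).
toy_queries = [
    (["a", "b", "c", "d"], {"b", "d"}),  # two correct answers, at ranks 2 and 4
    (["x", "y"], {"x"}),                 # single correct answer at rank 1
]
scores = evaluate(toy_queries)
```

Under a single-answer assumption, accuracy would count the first toy query as a plain miss; MAP and MRR still give credit for correct answers ranked below the top position.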
Anthology ID:
W17-2303
Volume:
BioNLP 2017
Month:
August
Year:
2017
Address:
Vancouver, Canada
Editors:
Kevin Bretonnel Cohen, Dina Demner-Fushman, Sophia Ananiadou, Junichi Tsujii
Venue:
BioNLP
SIG:
SIGBIOMED
Publisher:
Association for Computational Linguistics
Pages:
19–28
URL:
https://aclanthology.org/W17-2303/
DOI:
10.18653/v1/W17-2303
Cite (ACL):
Denis Newman-Griffis, Albert Lai, and Eric Fosler-Lussier. 2017. Insights into Analogy Completion from the Biomedical Domain. In BioNLP 2017, pages 19–28, Vancouver, Canada. Association for Computational Linguistics.
Cite (Informal):
Insights into Analogy Completion from the Biomedical Domain (Newman-Griffis et al., BioNLP 2017)
PDF:
https://aclanthology.org/W17-2303.pdf
Code:
OSU-slatelab/BMASS