Homonymy Information for English WordNet

Rowan Hall Maudslay, Simone Teufel


Abstract
A widely acknowledged shortcoming of WordNet is that it does not distinguish between word meanings that are systematically related (polysemy) and those that share a form only by coincidence (homonymy). Several previous works have attempted to fill this gap by inferring the distinction with computational methods. We revisit this task and exploit recent advances in language modelling to synthesise homonymy annotation for Princeton WordNet. Previous approaches treat the problem as one of clustering; by contrast, our method links WordNet to the Oxford English Dictionary, which already contains the information we need. To perform this alignment, we pair definitions based on their proximity in an embedding space produced by a Transformer model. Despite the simplicity of this approach, our best model attains an F1 of 0.97 on an evaluation set that we annotate. The outcome of our work is a high-quality homonymy annotation layer for Princeton WordNet, which we release.
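The alignment step described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the `embed` function is a toy bag-of-words stand-in for the Transformer encoder, all sense identifiers and definitions are invented for illustration, and the rule that two senses are homonymous when their linked dictionary senses fall under different entries is an assumption about how the dictionary information would be used.

```python
import math
from collections import Counter


def embed(text):
    """Toy stand-in for a Transformer sentence encoder (assumption):
    a bag-of-words count vector over whitespace tokens."""
    return Counter(text.lower().split())


def cosine(u, v):
    """Cosine similarity between two sparse count vectors."""
    dot = sum(u[t] * v[t] for t in u if t in v)
    norm_u = math.sqrt(sum(x * x for x in u.values()))
    norm_v = math.sqrt(sum(x * x for x in v.values()))
    return dot / (norm_u * norm_v) if norm_u and norm_v else 0.0


def align(wordnet_defs, dictionary_defs):
    """Link each WordNet sense to the dictionary sense whose
    definition is closest in the embedding space."""
    dict_vecs = {d: embed(text) for d, text in dictionary_defs.items()}
    links = {}
    for wn_id, wn_text in wordnet_defs.items():
        wn_vec = embed(wn_text)
        links[wn_id] = max(dict_vecs, key=lambda d: cosine(wn_vec, dict_vecs[d]))
    return links


def relation(sense_a, sense_b, links, dictionary_entry):
    """Assumed rule: senses linked to different dictionary entries are
    homonymous; senses linked to the same entry are polysemous."""
    same_entry = dictionary_entry[links[sense_a]] == dictionary_entry[links[sense_b]]
    return "polysemy" if same_entry else "homonymy"


# Invented example data for the word "bank" (not real WordNet or OED text).
wordnet_defs = {
    "wn_bank_a": "sloping land beside a body of water",
    "wn_bank_b": "a financial institution that accepts deposits",
    "wn_bank_c": "a building where banking business is done",
}
dictionary_defs = {
    "dict_bank_1a": "the sloping land alongside a river or body of water",
    "dict_bank_2a": "an institution that accepts deposits and lends money",
    "dict_bank_2b": "the building where a financial institution does business",
}
# Hypothetical mapping from each dictionary sense to its headword entry.
dictionary_entry = {
    "dict_bank_1a": "bank_1",
    "dict_bank_2a": "bank_2",
    "dict_bank_2b": "bank_2",
}

links = align(wordnet_defs, dictionary_defs)
for a, b in [("wn_bank_a", "wn_bank_b"), ("wn_bank_b", "wn_bank_c")]:
    print(a, b, relation(a, b, links, dictionary_entry))
```

In a realistic setting, `embed` would be replaced by a sentence encoder, and candidate dictionary senses would be restricted to those listed under the same word form.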
Anthology ID:
2022.gwll-1.13
Volume:
Proceedings of Globalex Workshop on Linked Lexicography within the 13th Language Resources and Evaluation Conference
Month:
June
Year:
2022
Address:
Marseille, France
Editors:
Ilan Kernerman, Simon Krek
Venue:
gwll
Publisher:
European Language Resources Association
Pages:
90–98
URL:
https://aclanthology.org/2022.gwll-1.13
Cite (ACL):
Rowan Hall Maudslay and Simone Teufel. 2022. Homonymy Information for English WordNet. In Proceedings of Globalex Workshop on Linked Lexicography within the 13th Language Resources and Evaluation Conference, pages 90–98, Marseille, France. European Language Resources Association.
Cite (Informal):
Homonymy Information for English WordNet (Maudslay & Teufel, gwll 2022)
PDF:
https://aclanthology.org/2022.gwll-1.13.pdf
Code
 rowanhm/wordnet-homonymy