Connect-the-Dots: Bridging Semantics between Words and Definitions via Aligning Word Sense Inventories

Wenlin Yao, Xiaoman Pan, Lifeng Jin, Jianshu Chen, Dian Yu, Dong Yu


Abstract
Word Sense Disambiguation (WSD) aims to automatically identify the exact meaning of a word given its context. Existing supervised models struggle to make correct predictions on rare word senses because training data is limited, and they can only select the best definition sentence from a single predefined word sense inventory (e.g., WordNet). To address the data sparsity problem and make the model independent of any one predefined inventory, we propose a gloss alignment algorithm that aligns definition sentences (glosses) with the same meaning across different sense inventories to collect rich lexical knowledge. Using these aligned inventories, we then train a model to identify semantic equivalence between a target word in context and one of its glosses; the resulting model exhibits strong transfer capability to many WSD tasks. Experiments on benchmark datasets show that the proposed method improves predictions on both frequent and rare word senses, outperforming prior work by 1.2% on the All-Words WSD task and 4.3% on the Low-Shot WSD task. Evaluation on the WiC task also indicates that our method better captures word meanings in context.
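To make the idea of gloss alignment concrete, the following is a minimal illustrative sketch and not the algorithm from the paper. It assumes that similarity between two glosses can be approximated by token overlap (Jaccard) and that a greedy one-to-one matching is acceptable. The function names, the threshold value, and the two example inventories are all hypothetical.

```python
import re


def tokens(gloss):
    """Lowercase a gloss and return its set of alphabetic word tokens."""
    return set(re.findall(r"[a-z]+", gloss.lower()))


def jaccard(gloss_a, gloss_b):
    """Token-overlap similarity in [0, 1]; a stand-in for a learned similarity."""
    ta, tb = tokens(gloss_a), tokens(gloss_b)
    if not ta or not tb:
        return 0.0
    return len(ta & tb) / len(ta | tb)


def align_glosses(glosses_a, glosses_b, threshold=0.15):
    """Greedily pair glosses of one word from two inventories.

    Returns sorted (index_in_a, index_in_b) pairs. Each gloss is used at
    most once, and pairs scoring below `threshold` are left unaligned.
    The threshold is an assumed value chosen for this toy example.
    """
    scored = sorted(
        (
            (jaccard(a, b), i, j)
            for i, a in enumerate(glosses_a)
            for j, b in enumerate(glosses_b)
        ),
        reverse=True,
    )
    used_a, used_b, pairs = set(), set(), []
    for score, i, j in scored:
        if score < threshold:
            break  # every remaining candidate scores lower still
        if i in used_a or j in used_b:
            continue
        used_a.add(i)
        used_b.add(j)
        pairs.append((i, j))
    return sorted(pairs)


# Hypothetical glosses for the word "bank" in two made-up inventories.
inventory_a = [
    "a financial institution that accepts deposits",
    "sloping land beside a body of water",
]
inventory_b = [
    "the land alongside a river or lake",
    "an institution that keeps and lends money",
]

aligned = align_glosses(inventory_a, inventory_b)
for i, j in aligned:
    print(f"{inventory_a[i]!r} <-> {inventory_b[j]!r}")
```

In this toy setting the financial glosses pair with each other and the riverside glosses pair with each other. A more realistic variant would likely replace the token-overlap score with sentence embeddings and the greedy loop with an optimal assignment solver; both substitutions are suggestions here, not details taken from the abstract.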
Anthology ID:
2021.emnlp-main.610
Volume:
Proceedings of the 2021 Conference on Empirical Methods in Natural Language Processing
Month:
November
Year:
2021
Address:
Online and Punta Cana, Dominican Republic
Editors:
Marie-Francine Moens, Xuanjing Huang, Lucia Specia, Scott Wen-tau Yih
Venue:
EMNLP
Publisher:
Association for Computational Linguistics
Pages:
7741–7751
URL:
https://aclanthology.org/2021.emnlp-main.610
DOI:
10.18653/v1/2021.emnlp-main.610
Cite (ACL):
Wenlin Yao, Xiaoman Pan, Lifeng Jin, Jianshu Chen, Dian Yu, and Dong Yu. 2021. Connect-the-Dots: Bridging Semantics between Words and Definitions via Aligning Word Sense Inventories. In Proceedings of the 2021 Conference on Empirical Methods in Natural Language Processing, pages 7741–7751, Online and Punta Cana, Dominican Republic. Association for Computational Linguistics.
Cite (Informal):
Connect-the-Dots: Bridging Semantics between Words and Definitions via Aligning Word Sense Inventories (Yao et al., EMNLP 2021)
PDF:
https://aclanthology.org/2021.emnlp-main.610.pdf
Video:
https://aclanthology.org/2021.emnlp-main.610.mp4
Code:
tencent-ailab/EMNLP21_SemEq
Data:
WiC, Word Sense Disambiguation: a Unified Evaluation Framework and Empirical Comparison