One Classifier for All Ambiguous Words: Overcoming Data Sparsity by Utilizing Sense Correlations Across Words

Prafulla Kumar Choubey; Ruihong Huang

One Classifier for All Ambiguous Words: Overcoming Data Sparsity by Utilizing Sense Correlations Across Words

Abstract

Most supervised word sense disambiguation (WSD) systems build word-specific classifiers by leveraging labeled data. However, when using word-specific classifiers, the sparseness of annotations leads to inferior sense disambiguation performance on less frequently seen words. To combat data sparsity, we propose to learn a single model that derives sense representations and meanwhile enforces congruence between a word instance and its right sense by using both sense-annotated data and lexical resources. The model is shared across words that allows utilizing sense correlations across words, and therefore helps to transfer common disambiguation rules from annotation-rich words to annotation-lean words. Empirical evaluation on benchmark datasets shows that the proposed shared model outperforms the equivalent classifier-based models by 1.7%, 2.5% and 3.8% in F1-score when using GloVe, ELMo and BERT word embeddings respectively.

Anthology ID:: 2020.lrec-1.732
Volume:: Proceedings of the Twelfth Language Resources and Evaluation Conference
Month:: May
Year:: 2020
Address:: Marseille, France
Editors:: Nicoletta Calzolari, Frédéric Béchet, Philippe Blache, Khalid Choukri, Christopher Cieri, Thierry Declerck, Sara Goggi, Hitoshi Isahara, Bente Maegaard, Joseph Mariani, Hélène Mazo, Asuncion Moreno, Jan Odijk, Stelios Piperidis
Venue:: LREC
SIG:
Publisher:: European Language Resources Association
Note:
Pages:: 5978–5985
Language:: English
URL:: https://aclanthology.org/2020.lrec-1.732/
DOI:
Bibkey:
Cite (ACL):: Prafulla Kumar Choubey and Ruihong Huang. 2020. One Classifier for All Ambiguous Words: Overcoming Data Sparsity by Utilizing Sense Correlations Across Words. In Proceedings of the Twelfth Language Resources and Evaluation Conference, pages 5978–5985, Marseille, France. European Language Resources Association.
Cite (Informal):: One Classifier for All Ambiguous Words: Overcoming Data Sparsity by Utilizing Sense Correlations Across Words (Choubey & Huang, LREC 2020)
Copy Citation:
PDF:: https://aclanthology.org/2020.lrec-1.732.pdf

PDF Cite Search Fix data