Word Sense Induction with Attentive Context Clustering

Moshe Stekel, Amos Azaria, Shai Gordin


Abstract
In this paper, we present ACCWSI (Attentive Context Clustering WSI), a method for Word Sense Induction suitable for languages with limited resources. ACCWSI is pretrained on a small corpus. Given an ambiguous word (the query word) and a set of excerpts that contain it, ACCWSI uses an attention mechanism to generate context-aware embeddings that distinguish between the different senses assigned to the query word. These embeddings are then clustered into groups that correspond to the main common uses of the query word. The method demonstrates practical applicability for shedding light on the meanings of ambiguous words in ancient languages, such as Classical Hebrew.
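The two stages described in the abstract (attention-weighted context embeddings, then clustering) can be illustrated with a minimal sketch. This is not the authors' implementation: the abstract does not specify the attention form or the clustering algorithm, so dot-product softmax attention and plain k-means are assumed here. The `VECTORS` table, the English example word "bank", and the function names `attentive_embedding`, `kmeans`, and `induce_senses` are all hypothetical, and the hand-made 2-D vectors stand in for embeddings that a real system would learn during pretraining.

```python
import math

# Hypothetical 2-D word vectors; a real system would learn these from a corpus.
VECTORS = {
    "bank": [1.0, 1.0],
    "money": [1.0, 0.0], "loan": [0.9, 0.1], "deposit": [0.8, 0.2],
    "river": [0.0, 1.0], "water": [0.1, 0.9], "shore": [0.2, 0.8],
    "the": [0.0, 0.0], "near": [0.0, 0.0],
}

def dot(u, v):
    return sum(a * b for a, b in zip(u, v))

def sq_dist(u, v):
    return sum((a - b) ** 2 for a, b in zip(u, v))

def attentive_embedding(query, context):
    """Softmax-weighted average of context word vectors, scored against the query word."""
    q = VECTORS[query]
    vecs = [VECTORS[w] for w in context if w != query and w in VECTORS]
    scores = [dot(q, v) for v in vecs]
    top = max(scores)  # subtract the maximum for numerical stability
    weights = [math.exp(s - top) for s in scores]
    total = sum(weights)
    return [sum(w * v[i] for w, v in zip(weights, vecs)) / total
            for i in range(len(q))]

def kmeans(points, k=2, iterations=10):
    """Plain k-means with deterministic farthest-point initialisation (an assumption)."""
    centroids = [points[0]]
    while len(centroids) < k:
        centroids.append(
            max(points, key=lambda p: min(sq_dist(p, c) for c in centroids)))
    labels = [0] * len(points)
    for _ in range(iterations):
        # Assign each point to its nearest centroid.
        labels = [min(range(k), key=lambda j: sq_dist(p, centroids[j]))
                  for p in points]
        # Move each centroid to the mean of its members.
        for j in range(k):
            members = [p for p, label in zip(points, labels) if label == j]
            if members:
                centroids[j] = [sum(x) / len(members) for x in zip(*members)]
    return labels

def induce_senses(query, excerpts, k=2):
    """Embed each excerpt around the query word, then cluster the embeddings."""
    embeddings = [attentive_embedding(query, e.split()) for e in excerpts]
    return kmeans(embeddings, k)

excerpts = [
    "the bank loan money",
    "deposit money near the bank",
    "the river bank near water",
    "water near the bank shore",
]
labels = induce_senses("bank", excerpts)
```

On this toy input the two financial excerpts share one cluster label and the two river excerpts share the other. The attention step matters because function words such as "the" score low against the query word and so contribute little to the excerpt embedding.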
Anthology ID:
2021.nlp4dh-1.17
Volume:
Proceedings of the Workshop on Natural Language Processing for Digital Humanities
Month:
December
Year:
2021
Address:
NIT Silchar, India
Editors:
Mika Hämäläinen, Khalid Alnajjar, Niko Partanen, Jack Rueter
Venue:
NLP4DH
Publisher:
NLP Association of India (NLPAI)
Pages:
144–151
URL:
https://aclanthology.org/2021.nlp4dh-1.17
Cite (ACL):
Moshe Stekel, Amos Azaria, and Shai Gordin. 2021. Word Sense Induction with Attentive Context Clustering. In Proceedings of the Workshop on Natural Language Processing for Digital Humanities, pages 144–151, NIT Silchar, India. NLP Association of India (NLPAI).
Cite (Informal):
Word Sense Induction with Attentive Context Clustering (Stekel et al., NLP4DH 2021)
PDF:
https://aclanthology.org/2021.nlp4dh-1.17.pdf