Estimating senses with sets of lexically related words for Polish word sense disambiguation

Szymon Rutkowski, Piotr Rychlik, Agnieszka Mykowiecka


Abstract
We propose a new algorithm for word sense disambiguation that exploits data from a WordNet with many types of lexical relations, such as plWordNet for Polish. In this method, sense probabilities in context are approximated with a language model. To estimate the likelihood of a sense appearing within a word sequence, the token being disambiguated is replaced with words lexically related to the given sense or with words appearing in its WordNet gloss. We test this approach with a number of neural language models on a set of sense-annotated Polish sentences. Our best setup achieves an accuracy of 55.12% (72.02% when first senses are excluded), compared with 51.77% for an existing PageRank-based method. Although this does not exceed the first-sense baseline (the first sense often being the most frequent) in the standard case, the result encourages further research on combining WordNet data with neural models.
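The substitution idea described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the smoothed bigram scorer `toy_log_prob` stands in for the neural language models used in the paper, the English `SENSES` inventory is a hypothetical replacement for plWordNet relations and glosses, and averaging log-probabilities over a sense's substitutes is an assumed aggregation rule.

```python
import math
from collections import Counter
from typing import Callable, Dict, List, Sequence

# Toy corpus standing in for the training data of a real language model.
CORPUS = [
    "she sat on the shore of the river",
    "he walked along the riverside of the river",
    "the lender approved the loan",
    "the institution approved the loan",
]

_bigrams: Counter = Counter()
_history: Counter = Counter()
_vocab = set()
for _line in CORPUS:
    _words = _line.split()
    _vocab.update(_words)
    for _prev, _cur in zip(_words, _words[1:]):
        _bigrams[(_prev, _cur)] += 1
        _history[_prev] += 1


def toy_log_prob(tokens: Sequence[str]) -> float:
    """Add-one smoothed bigram log-probability (assumed stand-in for an LM)."""
    v = len(_vocab)
    return sum(
        math.log((_bigrams[(p, c)] + 1) / (_history[p] + v))
        for p, c in zip(tokens, tokens[1:])
    )


def score_senses(
    tokens: Sequence[str],
    target_index: int,
    sense_substitutes: Dict[str, List[str]],
    log_prob: Callable[[Sequence[str]], float],
) -> Dict[str, float]:
    """Score each sense by the mean LM score of sentences in which the
    target token is replaced by a word related to that sense."""
    scores: Dict[str, float] = {}
    for sense, substitutes in sense_substitutes.items():
        if not substitutes:
            continue  # no related words available for this sense
        total = 0.0
        for word in substitutes:
            candidate = list(tokens)
            candidate[target_index] = word
            total += log_prob(candidate)
        scores[sense] = total / len(substitutes)
    return scores


def disambiguate(tokens, target_index, sense_substitutes, log_prob) -> str:
    """Return the sense whose substitutes fit the context best."""
    scores = score_senses(tokens, target_index, sense_substitutes, log_prob)
    return max(scores, key=scores.get)


# Hypothetical sense inventory: sense label -> lexically related words.
SENSES = {
    "bank.financial": ["lender", "institution"],
    "bank.river": ["shore", "riverside"],
}

sentence = "she sat on the bank of the river".split()
chosen = disambiguate(sentence, 4, SENSES, toy_log_prob)
```

In a realistic setup, `log_prob` would wrap a trained language model and the substitute lists would be drawn from the WordNet relations and gloss of each candidate sense.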
Anthology ID:
2019.gwc-1.15
Volume:
Proceedings of the 10th Global Wordnet Conference
Month:
July
Year:
2019
Address:
Wroclaw, Poland
Editors:
Piek Vossen, Christiane Fellbaum
Venue:
GWC
SIG:
SIGLEX
Publisher:
Global Wordnet Association
Pages:
118–124
URL:
https://aclanthology.org/2019.gwc-1.15
Cite (ACL):
Szymon Rutkowski, Piotr Rychlik, and Agnieszka Mykowiecka. 2019. Estimating senses with sets of lexically related words for Polish word sense disambiguation. In Proceedings of the 10th Global Wordnet Conference, pages 118–124, Wroclaw, Poland. Global Wordnet Association.
Cite (Informal):
Estimating senses with sets of lexically related words for Polish word sense disambiguation (Rutkowski et al., GWC 2019)
PDF:
https://aclanthology.org/2019.gwc-1.15.pdf