Using Two Losses and Two Datasets Simultaneously to Improve TempoWiC Accuracy

Mohammad Javad Pirhadi, Motahhare Mirzaei, Sauleh Eetemadi


Abstract
WSD (Word Sense Disambiguation) is the task of identifying which sense of a word is meant in a sentence or other segment of text. Researchers have worked on this task for years (e.g. Pustejovsky, 2002), but it remains challenging even for SOTA (state-of-the-art) LMs (language models). The TempoWiC dataset, introduced by Loureiro et al. (2022b), focuses on the fact that word meanings change over time. Their best baseline achieves 70.33% macro-F1. In this work, we use two different losses simultaneously, and we further improve our model by training on another, similar dataset to generalize better. Our best configuration beats their best baseline by 4.23%.
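The abstract describes the approach only at a high level. As a hedged illustration, the sketch below shows one way a weighted sum of two losses, computed on batches from two datasets, can be optimised in a single training step. The specific losses (binary cross-entropy plus a cosine-embedding term), the mixing weight alpha, the toy TinyWiCModel, and the randomly generated batches are all assumptions made for illustration, not the authors' actual configuration.

```python
import torch
import torch.nn as nn

class TinyWiCModel(nn.Module):
    """Toy stand-in for an LM encoder with a pairwise 'same meaning?' head."""
    def __init__(self, dim=32):
        super().__init__()
        self.encoder = nn.Linear(dim, dim)        # stand-in for a transformer encoder
        self.classifier = nn.Linear(2 * dim, 1)   # binary "same meaning?" head

    def forward(self, x_a, x_b):
        h_a, h_b = self.encoder(x_a), self.encoder(x_b)
        logits = self.classifier(torch.cat([h_a, h_b], dim=-1)).squeeze(-1)
        return logits, (h_a, h_b)

model = TinyWiCModel()
optimizer = torch.optim.AdamW(model.parameters(), lr=1e-3)
bce_loss = nn.BCEWithLogitsLoss()      # loss 1: classification on TempoWiC labels
cos_loss = nn.CosineEmbeddingLoss()    # loss 2: assumed auxiliary embedding loss
alpha = 0.5                            # assumed mixing weight (a hyperparameter)

# Random tensors stand in for encoded batches from the two datasets.
tempo_a, tempo_b = torch.randn(8, 32), torch.randn(8, 32)
tempo_y = torch.randint(0, 2, (8,)).float()           # 0/1 TempoWiC labels
extra_a, extra_b = torch.randn(8, 32), torch.randn(8, 32)
extra_y = torch.randint(0, 2, (8,)).float() * 2 - 1   # -1/+1 targets for cosine loss

optimizer.zero_grad()
logits, _ = model(tempo_a, tempo_b)
loss_1 = bce_loss(logits, tempo_y)                    # loss on the TempoWiC batch
_, (h_a, h_b) = model(extra_a, extra_b)
loss_2 = cos_loss(h_a, h_b, extra_y)                  # loss on the second dataset
loss = alpha * loss_1 + (1.0 - alpha) * loss_2        # both losses optimised at once
loss.backward()
optimizer.step()
```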
Anthology ID: 2022.evonlp-1.3
Volume: Proceedings of the First Workshop on Ever Evolving NLP (EvoNLP)
Month: December
Year: 2022
Address: Abu Dhabi, United Arab Emirates (Hybrid)
Editors: Francesco Barbieri, Jose Camacho-Collados, Bhuwan Dhingra, Luis Espinosa-Anke, Elena Gribovskaya, Angeliki Lazaridou, Daniel Loureiro, Leonardo Neves
Venue: EvoNLP
Publisher: Association for Computational Linguistics
Pages: 12–15
URL: https://aclanthology.org/2022.evonlp-1.3
DOI: 10.18653/v1/2022.evonlp-1.3
Cite (ACL): Mohammad Javad Pirhadi, Motahhare Mirzaei, and Sauleh Eetemadi. 2022. Using Two Losses and Two Datasets Simultaneously to Improve TempoWiC Accuracy. In Proceedings of the First Workshop on Ever Evolving NLP (EvoNLP), pages 12–15, Abu Dhabi, United Arab Emirates (Hybrid). Association for Computational Linguistics.
Cite (Informal): Using Two Losses and Two Datasets Simultaneously to Improve TempoWiC Accuracy (Pirhadi et al., EvoNLP 2022)
PDF: https://aclanthology.org/2022.evonlp-1.3.pdf