Mikhail Kuklin


2025

Deep-change at CoMeDi: the Cross-Entropy Loss is not All You Need
Mikhail Kuklin | Nikolay Arefyev
Proceedings of Context and Meaning: Navigating Disagreements in NLP Annotation

Manual annotation of edges in Diachronic Word Usage Graphs is a critical step in the creation of datasets for Lexical Semantic Change Detection tasks, but a very labour-intensive one. Annotators judge whether, and how, the two senses of an ambiguous word expressed in two usages of this word are related. This is a variation of the Word-in-Context (WiC) task with some peculiarities, including diachronic data, an ordinal annotation scale of 4 values with pre-defined meanings (e.g. homonymy, polysemy), and special attention to the degree of disagreement between annotators, which affects the further processing of the graph. CoMeDi is a shared task aiming at automating this annotation process. Participants are asked to predict the median annotation for a pair of usages in the first subtask, and to estimate the disagreement between annotators in the second subtask. Together, these give some idea of the distribution of annotations we can get from humans for a given pair of usages. For the first subtask we tried several ways of adapting a binary WiC model to this 4-class problem. We discovered that further fine-tuning the model as a 4-class classifier on the training data of the shared task works significantly worse than thresholding the original binary model. For the second subtask our best results were achieved by building a model that predicts the whole multinomial distribution of annotations and calculating the disagreement from this distribution. Our solutions for both subtasks outperformed those of all other participants of the shared task.
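
The two ideas named in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation. The threshold values, the function names, and the definition of disagreement as the expected absolute difference between two independently drawn annotations are all assumptions made for illustration.

```python
from bisect import bisect_right
from itertools import product


def score_to_label(score, thresholds=(0.25, 0.5, 0.75)):
    """Map a binary-model score in [0, 1] to an ordinal label in 1..4.

    The thresholds are hypothetical placeholders; in practice they
    would be tuned on development data.
    """
    return 1 + bisect_right(thresholds, score)


def expected_disagreement(probs, labels=(1, 2, 3, 4)):
    """Expected |a - b| for two annotations drawn independently from probs.

    Assumed definition of disagreement, used here only as an example of
    deriving a disagreement value from a predicted multinomial distribution.
    """
    pairs = product(zip(labels, probs), repeat=2)
    return sum(p * q * abs(a - b) for (a, p), (b, q) in pairs)


if __name__ == "__main__":
    print(score_to_label(0.6))
    print(expected_disagreement([0.5, 0.0, 0.0, 0.5]))
```

A distribution concentrated on one label yields zero disagreement under this definition, while mass split between the two ends of the scale yields the largest values.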

2024

Deep-change at AXOLOTL-24: Orchestrating WSD and WSI Models for Semantic Change Modeling
Denis Kokosinskii | Mikhail Kuklin | Nikolay Arefyev
Proceedings of the 5th Workshop on Computational Approaches to Historical Language Change