Daniil Homskiy
2023
DN at SemEval-2023 Task 12: Low-Resource Language Text Classification via Multilingual Pretrained Language Model Fine-tuning
Daniil Homskiy | Narek Maloyan
Proceedings of the 17th International Workshop on Semantic Evaluation (SemEval-2023)
In our work, we implement a model that solves the task using multilingual pre-trained models. We also consider various methods of data preprocessing.
2022
DeepMistake at LSCDiscovery: Can a Multilingual Word-in-Context Model Replace Human Annotators?
Daniil Homskiy | Nikolay Arefyev
Proceedings of the 3rd Workshop on Computational Approaches to Historical Language Change
In this paper we describe our solution to the LSCDiscovery shared task on Lexical Semantic Change Discovery (LSCD) in Spanish. Our solution employs a Word-in-Context (WiC) model, which is trained to determine whether a particular word has the same meaning in two given contexts. We essentially try to replicate the annotation of the dataset for the shared task, replacing human annotators with a neural network. In the graded change discovery subtask, our solution achieved the 2nd best result according to all metrics. In the main binary change detection subtask, our F1-score is 0.655 compared to 0.716 for the best submission, corresponding to 5th place. However, in the optional sense gain detection subtask we outperformed all other participants. During the post-evaluation experiments we compared different ways to prepare WiC data in Spanish for fine-tuning. We found that it helps to keep only examples annotated as 1 (unrelated senses) and 4 (identical senses) rather than using twice as many examples including intermediate annotations.