Raghuraman Swaminathan


2023

Token-level Identification of Multiword Expressions using Pre-trained Multilingual Language Models
Raghuraman Swaminathan | Paul Cook
Proceedings of the 19th Workshop on Multiword Expressions (MWE 2023)

In this paper, we consider novel cross-lingual settings for multiword expression (MWE) identification (Ramisch et al., 2020) and idiomaticity prediction (Tayyar Madabushi et al., 2022) in which systems are tested on languages that are unseen during training. Our findings indicate that pre-trained multilingual language models are able to learn knowledge about MWEs and idiomaticity that is not language-specific. Moreover, we find that training data from other languages can be leveraged to give improvements over monolingual models.