Parisa Rastin
2023
Exploring the Relationship between Alignment and Cross-lingual Transfer in Multilingual Transformers
Felix Gaschi | Patricio Cerda | Parisa Rastin | Yannick Toussaint
Findings of the Association for Computational Linguistics: ACL 2023
Without any explicit cross-lingual training data, multilingual language models can achieve cross-lingual transfer. One common way to improve this transfer is to perform realignment steps before fine-tuning, i.e., to train the model to build similar representations for pairs of words from translated sentences. However, such realignment methods have been found not to always improve results across languages and tasks, which raises the question of whether aligned representations are truly beneficial for cross-lingual transfer. We provide evidence that alignment is significantly correlated with cross-lingual transfer across languages, models, and random seeds. We show that fine-tuning can have a significant impact on alignment, depending mainly on the downstream task and the model. Finally, we show that realignment can, in some instances, improve cross-lingual transfer, and we identify the conditions under which realignment methods provide significant improvements. Namely, realignment works better on tasks for which alignment correlates with cross-lingual transfer, when generalizing to a distant language, with smaller models, and when using a bilingual dictionary rather than FastAlign to extract realignment pairs. For example, for POS tagging between English and Arabic, realignment brings a +15.8 point accuracy improvement with distilmBERT, even outperforming XLM-R Large by 1.7 points. We thus advocate for further research on realignment methods for smaller multilingual models as an alternative to scaling.
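As a rough illustration only (not the exact objective used in the paper), a single realignment step on a translated sentence pair could look like the sketch below. The model name, sentences, alignment pairs, and choice of loss are all illustrative assumptions; in practice the pairs would come from a bilingual dictionary or FastAlign, and such steps would be performed before fine-tuning on the downstream task.

```python
# Minimal sketch of one realignment step, assuming a pair of translated
# sentences and word-alignment pairs (e.g. from a bilingual dictionary).
# Contextual embeddings of aligned words are pulled together with an L2 loss;
# the model, sentences, and alignments below are purely illustrative.
import torch
from transformers import AutoModel, AutoTokenizer

model_name = "distilbert-base-multilingual-cased"  # distilmBERT
tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModel.from_pretrained(model_name)
optimizer = torch.optim.AdamW(model.parameters(), lr=1e-5)

src_words = "The cat sleeps on the mat".split()
tgt_words = "Le chat dort sur le tapis".split()
alignment = [(1, 1), (2, 2), (5, 5)]  # (source word index, target word index)

def word_embeddings(words):
    """Encode a pre-tokenized sentence and mean-pool sub-word vectors per word."""
    enc = tokenizer(words, is_split_into_words=True, return_tensors="pt")
    hidden = model(**enc).last_hidden_state[0]  # (seq_len, hidden_dim)
    word_ids = enc.word_ids(0)                  # maps each sub-word to its word
    pooled = [
        hidden[[i for i, w in enumerate(word_ids) if w == k]].mean(dim=0)
        for k in range(len(words))
    ]
    return torch.stack(pooled)                  # (num_words, hidden_dim)

src_emb, tgt_emb = word_embeddings(src_words), word_embeddings(tgt_words)
src_idx = [i for i, _ in alignment]
tgt_idx = [j for _, j in alignment]

optimizer.zero_grad()
loss = torch.nn.functional.mse_loss(src_emb[src_idx], tgt_emb[tgt_idx])
loss.backward()
optimizer.step()
```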
Multilingual Clinical NER: Translation or Cross-lingual Transfer?
Félix Gaschi | Xavier Fontaine | Parisa Rastin | Yannick Toussaint
Proceedings of the 5th Clinical Natural Language Processing Workshop
Natural language processing tasks such as Named Entity Recognition (NER) on non-English clinical texts can be very time-consuming and expensive due to the lack of annotated data. Cross-lingual transfer (CLT) is a way to circumvent this issue, thanks to the ability of multilingual large language models to be fine-tuned on a task in one language and to provide high accuracy on the same task in another language. However, methods leveraging translation models can also be used to perform NER without annotated data in the target language, by translating either the training set or the test set. This paper compares cross-lingual transfer with these two alternative methods for performing clinical NER in French and in German without any training data in those languages. To this end, we release MedNERF, a medical NER test set extracted from French drug prescriptions and annotated with the same guidelines as an English dataset. Through extensive experiments on this dataset and on a German medical dataset (Frei and Kramer, 2021), we show that translation-based methods can achieve performance similar to CLT but require more care in their design. While they can take advantage of monolingual clinical language models, these do not guarantee better results than large general-purpose multilingual models, whether with cross-lingual transfer or translation.
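For orientation only, the three strategies compared can be sketched schematically as below. The callables `fine_tune`, `predict`, `score`, `translate`, and `project_labels` are hypothetical placeholders (a task trainer, an inference step, a metric, an MT system, and annotation projection between original and translated text), not components released with the paper.

```python
# Schematic sketch of the three strategies compared (not the paper's exact
# pipelines). All callables passed in are hypothetical placeholders.

def cross_lingual_transfer(multilingual_model, en_train, tgt_test, fine_tune, predict, score):
    """Fine-tune a multilingual model on English NER; evaluate it zero-shot on the target language."""
    model = fine_tune(multilingual_model, en_train)
    return score(predict(model, tgt_test), tgt_test)

def translate_train(tgt_model, en_train, tgt_test, fine_tune, predict, score, translate, project_labels):
    """Translate the English training set into the target language, project the
    entity labels onto the translations, then train and evaluate in the target language."""
    tgt_train = project_labels(translate(en_train), en_train)
    model = fine_tune(tgt_model, tgt_train)
    return score(predict(model, tgt_test), tgt_test)

def translate_test(en_model, en_train, tgt_test, fine_tune, predict, score, translate, project_labels):
    """Train in English, translate the target-language test set into English,
    predict, then project the predicted entities back onto the original text."""
    model = fine_tune(en_model, en_train)
    en_predictions = predict(model, translate(tgt_test))
    return score(project_labels(en_predictions, tgt_test), tgt_test)
```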
Code-switching as a cross-lingual Training Signal: an Example with Unsupervised Bilingual Embedding
Felix Gaschi | Ilias El-Baamrani | Barbara Gendron | Parisa Rastin | Yannick Toussaint
Proceedings of the 3rd Workshop on Multi-lingual Representation Learning (MRL)