Free Lunch: Robust Cross-Lingual Transfer via Model Checkpoint Averaging

Fabian David Schmidt, Ivan Vulić, Goran Glavaš


Abstract
Massively multilingual language models have displayed strong performance in zero-shot (ZS-XLT) and few-shot (FS-XLT) cross-lingual transfer setups, where models fine-tuned on task data in a source language are transferred without any or with only a few annotated instances to the target language(s). However, current work typically overestimates model performance as fine-tuned models are frequently evaluated at model checkpoints that generalize best to validation instances in the target languages. This effectively violates the main assumptions of ‘true’ ZS-XLT and FS-XLT. Such XLT setups require robust methods that do not depend on labeled target language data for validation and model selection. In this work, aiming to improve the robustness of ‘true’ ZS-XLT and FS-XLT, we propose a simple and effective method that averages different checkpoints (i.e., model snapshots) during task fine-tuning. We conduct exhaustive ZS-XLT and FS-XLT experiments across higher-level semantic tasks (NLI, extractive QA) and lower-level token classification tasks (NER, POS). The results indicate that averaging model checkpoints yields systematic and consistent performance gains across diverse target languages in all tasks. Importantly, it simultaneously substantially desensitizes XLT to varying hyperparameter choices in the absence of target language validation. We also show that checkpoint averaging benefits performance when further combined with run averaging (i.e., averaging the parameters of models fine-tuned over independent runs).
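The checkpoint-averaging idea described in the abstract can be sketched in a few lines: parameters from several model snapshots saved during fine-tuning are averaged element-wise. The helper below is a hypothetical illustration (not the authors' released code) using NumPy arrays in place of real framework state dicts; a practical version would average e.g. a PyTorch `state_dict()` the same way.

```python
# Hedged sketch of checkpoint averaging (assumption: all snapshots share
# the same parameter names and shapes). `average_checkpoints` is a
# hypothetical helper, not the paper's released implementation.
from typing import Dict, List
import numpy as np

def average_checkpoints(
    checkpoints: List[Dict[str, np.ndarray]]
) -> Dict[str, np.ndarray]:
    """Element-wise mean of identically shaped parameter tensors."""
    assert checkpoints, "need at least one checkpoint"
    return {
        name: np.mean([ckpt[name] for ckpt in checkpoints], axis=0)
        for name in checkpoints[0]
    }

# Toy usage: three "snapshots" of a two-parameter model.
snapshots = [
    {"w": np.array([1.0, 2.0]), "b": np.array([0.0])},
    {"w": np.array([3.0, 4.0]), "b": np.array([1.0])},
    {"w": np.array([5.0, 6.0]), "b": np.array([2.0])},
]
avg = average_checkpoints(snapshots)
# avg["w"] is the mean [3.0, 4.0]; avg["b"] is [1.0]
```

Run averaging, also evaluated in the paper, applies the same element-wise mean across models fine-tuned in independent runs rather than across checkpoints of one run.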
Anthology ID:
2023.acl-long.314
Volume:
Proceedings of the 61st Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers)
Month:
July
Year:
2023
Address:
Toronto, Canada
Editors:
Anna Rogers, Jordan Boyd-Graber, Naoaki Okazaki
Venue:
ACL
Publisher:
Association for Computational Linguistics
Pages:
5712–5730
URL:
https://aclanthology.org/2023.acl-long.314
DOI:
10.18653/v1/2023.acl-long.314
Cite (ACL):
Fabian David Schmidt, Ivan Vulić, and Goran Glavaš. 2023. Free Lunch: Robust Cross-Lingual Transfer via Model Checkpoint Averaging. In Proceedings of the 61st Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers), pages 5712–5730, Toronto, Canada. Association for Computational Linguistics.
Cite (Informal):
Free Lunch: Robust Cross-Lingual Transfer via Model Checkpoint Averaging (Schmidt et al., ACL 2023)
PDF:
https://aclanthology.org/2023.acl-long.314.pdf
Video:
https://aclanthology.org/2023.acl-long.314.mp4