Regularising Fisher Information Improves Cross-lingual Generalisation

Asa Cooper Stickland, Iain Murray


Abstract
Many recent works use ‘consistency regularisation’ to improve the generalisation of fine-tuned pre-trained models, both multilingual and English-only. These works encourage model outputs to be similar between a perturbed and a normal version of the input, usually by penalising the Kullback–Leibler (KL) divergence between the output probability distributions of the perturbed and normal model. We believe that consistency losses may be implicitly regularising the loss landscape. In particular, we build on work hypothesising that implicitly or explicitly regularising the trace of the Fisher Information Matrix (FIM) amplifies the implicit bias of SGD to avoid memorisation. Our initial results show, both empirically and theoretically, that consistency losses are related to the FIM, and that the flat minima implied by a small trace of the FIM improve performance when fine-tuning a multilingual model on additional languages. We aim to confirm these initial results on more datasets, and to use our insights to develop better multilingual fine-tuning techniques.
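As a rough illustration of the two quantities the abstract relates, the sketch below computes a KL consistency penalty and the trace of the FIM for a toy linear softmax classifier. It is not the authors' implementation: the linear model, the Gaussian input perturbation, and the names `kl_consistency` and `fim_trace_linear_softmax` are assumptions chosen so that the gradients can be written out by hand.

```python
import numpy as np


def softmax(logits):
    # Subtract the row-wise max for numerical stability.
    shifted = logits - logits.max(axis=-1, keepdims=True)
    exp = np.exp(shifted)
    return exp / exp.sum(axis=-1, keepdims=True)


def kl_consistency(logits_clean, logits_perturbed, eps=1e-12):
    # Mean KL(p_clean || p_perturbed) over the batch.
    p = softmax(logits_clean)
    q = softmax(logits_perturbed)
    return float(np.mean(np.sum(p * (np.log(p + eps) - np.log(q + eps)), axis=-1)))


def fim_trace_linear_softmax(W, X):
    # Tr(F) = E_x E_{y ~ p(y|x)} || d log p(y|x) / dW ||^2
    # for the assumed toy model p(y|x) = softmax(W x).
    probs = softmax(X @ W.T)  # shape (n_examples, n_classes)
    n, k = probs.shape
    total = 0.0
    for i in range(n):
        for y in range(k):
            one_hot = np.zeros(k)
            one_hot[y] = 1.0
            # Gradient of log p(y|x_i) with respect to W.
            grad = np.outer(one_hot - probs[i], X[i])
            total += probs[i, y] * np.sum(grad ** 2)
    return total / n


# Hypothetical toy data: 8 inputs, 5 features, 3 classes.
rng = np.random.default_rng(0)
W = rng.normal(size=(3, 5))
X = rng.normal(size=(8, 5))
X_perturbed = X + 0.1 * rng.normal(size=X.shape)  # assumed Gaussian perturbation

consistency = kl_consistency(X @ W.T, X_perturbed @ W.T)
fim_trace = fim_trace_linear_softmax(W, X)
print(f"KL consistency penalty: {consistency:.4f}")
print(f"Trace of FIM:           {fim_trace:.4f}")
```

In this toy setting either quantity could be added to the task loss as a penalty. For a real pre-trained model the exact expectation over labels is usually replaced by sampling, and the perturbation may be applied to the input or to the model itself.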
Anthology ID:
2021.mrl-1.20
Volume:
Proceedings of the 1st Workshop on Multilingual Representation Learning
Month:
November
Year:
2021
Address:
Punta Cana, Dominican Republic
Editors:
Duygu Ataman, Alexandra Birch, Alexis Conneau, Orhan Firat, Sebastian Ruder, Gozde Gul Sahin
Venue:
MRL
Publisher:
Association for Computational Linguistics
Pages:
238–241
URL:
https://aclanthology.org/2021.mrl-1.20
DOI:
10.18653/v1/2021.mrl-1.20
Cite (ACL):
Asa Cooper Stickland and Iain Murray. 2021. Regularising Fisher Information Improves Cross-lingual Generalisation. In Proceedings of the 1st Workshop on Multilingual Representation Learning, pages 238–241, Punta Cana, Dominican Republic. Association for Computational Linguistics.
Cite (Informal):
Regularising Fisher Information Improves Cross-lingual Generalisation (Cooper Stickland & Murray, MRL 2021)
PDF:
https://aclanthology.org/2021.mrl-1.20.pdf
Video:
https://aclanthology.org/2021.mrl-1.20.mp4
Data
XNLI