Understanding Cross-Lingual Syntactic Transfer in Multilingual Recurrent Neural Networks

Prajit Dhar, Arianna Bisazza


Abstract
It is now established that modern neural language models can be successfully trained on multiple languages simultaneously without changes to the underlying architecture, providing an easy way to adapt a variety of NLP models to low-resource languages. But what kind of knowledge is really shared among languages within these models? Does multilingual training mostly lead to an alignment of the lexical representation spaces, or does it also enable the sharing of purely grammatical knowledge? In this paper we dissect different forms of cross-lingual transfer and identify the factors that most determine them, using a variety of models and probing tasks. We find that exposing our LMs to a related language does not always increase grammatical knowledge in the target language, and that the optimal conditions for lexical-semantic transfer may not be optimal for syntactic transfer.
Anthology ID:
2021.nodalida-main.8
Volume:
Proceedings of the 23rd Nordic Conference on Computational Linguistics (NoDaLiDa)
Month:
May 31–June 2
Year:
2021
Address:
Reykjavik, Iceland (Online)
Editors:
Simon Dobnik, Lilja Øvrelid
Venue:
NoDaLiDa
Publisher:
Linköping University Electronic Press, Sweden
Pages:
74–85
URL:
https://aclanthology.org/2021.nodalida-main.8
Cite (ACL):
Prajit Dhar and Arianna Bisazza. 2021. Understanding Cross-Lingual Syntactic Transfer in Multilingual Recurrent Neural Networks. In Proceedings of the 23rd Nordic Conference on Computational Linguistics (NoDaLiDa), pages 74–85, Reykjavik, Iceland (Online). Linköping University Electronic Press, Sweden.
Cite (Informal):
Understanding Cross-Lingual Syntactic Transfer in Multilingual Recurrent Neural Networks (Dhar & Bisazza, NoDaLiDa 2021)
PDF:
https://aclanthology.org/2021.nodalida-main.8.pdf
Data
XNLI