On the Acquisition of Shared Grammatical Representations in Bilingual Language Models

Catherine Arnett; Tyler A. Chang; James A. Michaelov; Benjamin K. Bergen

doi:10.18653/v1/2025.acl-long.1010

Abstract

Crosslingual transfer is crucial to contemporary language models’ multilingual capabilities, but how it occurs is not well understood. We ask what happens to a monolingual language model when it begins to be trained on a second language. Specifically, we train small bilingual models for which we control the amount of data for each language and the order of language exposure. To find evidence of shared multilingual representations, we turn to structural priming, a method used to study grammatical representations in humans. We first replicate previous crosslingual structural priming results and find that after controlling for training data quantity and language exposure, there are asymmetrical effects across language pairs and directions. We argue that this asymmetry may shape hypotheses about human structural priming effects. We also find that structural priming effects are less robust for less similar language pairs, highlighting potential limitations of crosslingual transfer learning and shared representations for typologically diverse languages.

Anthology ID: 2025.acl-long.1010
Volume: Proceedings of the 63rd Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers)
Month: July
Year: 2025
Address: Vienna, Austria
Editors: Wanxiang Che, Joyce Nabende, Ekaterina Shutova, Mohammad Taher Pilehvar
Venue: ACL
Publisher: Association for Computational Linguistics
Pages: 20707–20726
URL: https://aclanthology.org/2025.acl-long.1010/
DOI: 10.18653/v1/2025.acl-long.1010
Cite (ACL): Catherine Arnett, Tyler A. Chang, James A. Michaelov, and Ben Bergen. 2025. On the Acquisition of Shared Grammatical Representations in Bilingual Language Models. In Proceedings of the 63rd Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers), pages 20707–20726, Vienna, Austria. Association for Computational Linguistics.
Cite (Informal): On the Acquisition of Shared Grammatical Representations in Bilingual Language Models (Arnett et al., ACL 2025)
PDF: https://aclanthology.org/2025.acl-long.1010.pdf