BAMBINO-LM: (Bilingual-)Human-Inspired Continual Pre-training of BabyLM

Zhewen Shen; Aditya Joshi; Ruey-Cheng Chen

doi:10.18653/v1/2024.cmcl-1.1

Abstract

Children from bilingual backgrounds benefit from interactions with parents and teachers to re-acquire their heritage language. In this paper, we investigate how this insight from behavioral studies can be incorporated into the learning of small-scale language models. We introduce BAMBINO-LM, a continual pre-training strategy for BabyLM that uses a novel combination of alternation and a PPO-based perplexity reward induced from a parent Italian model. When evaluated on zero-shot classification tasks for English and Italian, BAMBINO-LM improves the Italian language capability of a BabyLM baseline. Our ablation analysis demonstrates that employing both the alternation strategy and PPO-based modeling is key to this effectiveness gain. We also show that, as a side effect, the proposed method leads to a degradation in L1 effectiveness similar to the one human children would experience in an equivalent learning scenario. Through its modeling and findings, BAMBINO-LM makes a focused contribution to the pre-training of small-scale language models by first developing a human-inspired strategy for pre-training and then showing that it results in behaviors similar to those of humans.

Anthology ID: 2024.cmcl-1.1
Volume: Proceedings of the Workshop on Cognitive Modeling and Computational Linguistics
Month: August
Year: 2024
Address: Bangkok, Thailand
Editors: Tatsuki Kuribayashi, Giulia Rambelli, Ece Takmaz, Philipp Wicke, Yohei Oseki
Venues: CMCL | WS
Publisher: Association for Computational Linguistics
Pages: 1–7
URL: https://aclanthology.org/2024.cmcl-1.1/
DOI: 10.18653/v1/2024.cmcl-1.1
Cite (ACL): Zhewen Shen, Aditya Joshi, and Ruey-Cheng Chen. 2024. BAMBINO-LM: (Bilingual-)Human-Inspired Continual Pre-training of BabyLM. In Proceedings of the Workshop on Cognitive Modeling and Computational Linguistics, pages 1–7, Bangkok, Thailand. Association for Computational Linguistics.
Cite (Informal): BAMBINO-LM: (Bilingual-)Human-Inspired Continual Pre-training of BabyLM (Shen et al., CMCL 2024)
PDF: https://aclanthology.org/2024.cmcl-1.1.pdf
