Do BabyLMs Wanna Learn Wanna Contraction? On the Learnability without Language-Specific Bias

Kangsan Noh; Sanghoun Song

Abstract

This study investigates whether the grammatical constraints on wanna contraction, a phenomenon traditionally cited as evidence for innate linguistic knowledge, can be learned by BabyLMs, which are designed to reflect cognitively plausible learning conditions. Two datasets were constructed from the CHILDES corpus, varying in embedded verb frequency (high vs. low) and contrasting grammatical instances of wanna contraction (object extraction contexts) with ungrammatical ones (subject extraction contexts). Using surprisal as a metric, we evaluated 24 BabyLMs from the 2024 BabyLM Challenge alongside four standard models, including BERT and GPT-2. While the standard models performed with near-perfect consistency, the BabyLMs showed modest but meaningful sensitivity, particularly those trained on larger datasets and tested on high-frequency wanna instances. Among the BabyLMs, only encoder-based models captured the grammatical constraint, with babylm24_MLSM exhibiting consistent performance. Nonetheless, our findings provide evidence for limited and conditional learnability of wanna contraction by artificial learners under cognitively realistic input conditions.
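The surprisal-based comparison mentioned in the abstract can be illustrated with a minimal sketch. It assumes the common minimal-pair setup, in which a model counts as sensitive to the constraint when it assigns lower total surprisal to the grammatical sentence than to its ungrammatical counterpart; the abstract does not spell out the exact scoring procedure, so this is an assumption rather than the authors' method. The function names and the token probabilities are hypothetical, chosen only for illustration.

```python
import math


def surprisal(prob: float) -> float:
    """Surprisal in bits of a token the model assigns probability `prob`."""
    return -math.log2(prob)


def sentence_surprisal(token_probs: list[float]) -> float:
    """Total surprisal of a sentence: the sum of its per-token surprisals."""
    return sum(surprisal(p) for p in token_probs)


def prefers_grammatical(gram_probs: list[float], ungram_probs: list[float]) -> bool:
    """True when the grammatical sentence is less surprising overall (assumed criterion)."""
    return sentence_surprisal(gram_probs) < sentence_surprisal(ungram_probs)


# Made-up per-token probabilities (not model outputs, not data from the paper).
# grammatical:   object extraction, e.g. "Who do you wanna see?"
# ungrammatical: subject extraction, e.g. "*Who do you wanna win the race?"
grammatical = [0.5, 0.25, 0.5]
ungrammatical = [0.5, 0.125, 0.25]

print(sentence_surprisal(grammatical))    # 4.0 bits
print(sentence_surprisal(ungrammatical))  # 6.0 bits
print(prefers_grammatical(grammatical, ungrammatical))
```

In practice the per-token probabilities would come from the language model under evaluation; for encoder-based models, which do not define left-to-right probabilities directly, some masked-token scoring scheme would have to stand in for them.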

Anthology ID: 2026.findings-acl.552
Volume: Findings of the Association for Computational Linguistics: ACL 2026
Month: July
Year: 2026
Address: San Diego, California, United States
Editors: Maria Liakata, Viviane P. Moreira, Jiajun Zhang, David Jurgens
Venue: Findings
Publisher: Association for Computational Linguistics
Pages: 11361–11372
URL: https://aclanthology.org/2026.findings-acl.552/
Cite (ACL): Kangsan Noh and Sanghoun Song. 2026. Do BabyLMs Wanna Learn Wanna Contraction? On the Learnability without Language-Specific Bias. In Findings of the Association for Computational Linguistics: ACL 2026, pages 11361–11372, San Diego, California, United States. Association for Computational Linguistics.
Cite (Informal): Do BabyLMs Wanna Learn Wanna Contraction? On the Learnability without Language-Specific Bias (Noh & Song, Findings 2026)
PDF: https://aclanthology.org/2026.findings-acl.552.pdf
Checklist: 2026.findings-acl.552.checklist.pdf