Fifty shapes of BLiMP: syntactic learning curves in language models are not uniform, but sometimes unruly

Bastian Bunzeck, Sina Zarrieß


Abstract
Syntactic learning curves in LMs are usually reported as relatively stable and power law-shaped. By analyzing the learning curves of small, self-trained Llama models and larger, pre-trained Pythia models on the syntactic phenomena in BLiMP, we show that while many phenomena do follow typical power-law curves, others exhibit S-shaped, U-shaped, or erratic patterns. Certain syntactic paradigms remain challenging even for large models, which retain a persistent preference for ungrammatical sentences. Most phenomena show similar curves across their paradigms, but the existence of diverging patterns and oscillations indicates that averaged curves mask important developments, underscoring the need for more detailed analyses of individual learning trajectories.
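The measurement behind these curves is per-paradigm BLiMP accuracy over training checkpoints: the fraction of minimal pairs in which the model assigns higher probability to the grammatical sentence. Below is a minimal sketch of that procedure, assuming the Hugging Face nyu-mll/blimp dataset and the public EleutherAI/pythia-70m intermediate checkpoints; the paper's exact models, paradigms, and evaluation details may differ.

```python
# Sketch: trace a BLiMP learning curve across Pythia checkpoints.
# Model and dataset names are Hugging Face Hub identifiers used for
# illustration; this is not the paper's exact evaluation code.
import torch
from datasets import load_dataset
from transformers import AutoModelForCausalLM, AutoTokenizer

def sentence_logprob(model, tokenizer, sentence):
    """Sum of token log-probabilities of `sentence` under the LM."""
    ids = tokenizer(sentence, return_tensors="pt").input_ids
    with torch.no_grad():
        logits = model(ids).logits
    log_probs = torch.log_softmax(logits[:, :-1], dim=-1)
    targets = ids[:, 1:]
    return log_probs.gather(2, targets.unsqueeze(-1)).sum().item()

def blimp_accuracy(model, tokenizer, paradigm, n=100):
    """Fraction of minimal pairs where the grammatical sentence wins."""
    data = load_dataset("nyu-mll/blimp", paradigm, split=f"train[:{n}]")
    hits = sum(
        sentence_logprob(model, tokenizer, ex["sentence_good"])
        > sentence_logprob(model, tokenizer, ex["sentence_bad"])
        for ex in data
    )
    return hits / len(data)

tokenizer = AutoTokenizer.from_pretrained("EleutherAI/pythia-70m")
for step in [1000, 8000, 64000, 143000]:  # intermediate checkpoints
    model = AutoModelForCausalLM.from_pretrained(
        "EleutherAI/pythia-70m", revision=f"step{step}"
    ).eval()
    acc = blimp_accuracy(model, tokenizer, "anaphor_gender_agreement", n=50)
    print(step, acc)
```

Plotting one such accuracy series per paradigm, rather than the average over all of BLiMP, is what reveals the S-shaped, U-shaped, and erratic trajectories the paper describes.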
Anthology ID:
2024.clasp-1.7
Volume:
Proceedings of the 2024 CLASP Conference on Multimodality and Interaction in Language Learning
Month:
October
Year:
2024
Address:
Gothenburg, Sweden
Editors:
Amy Qiu, Bill Noble, David Pagmar, Vladislav Maraev, Nikolai Ilinykh
Venue:
CLASP
SIG:
SIGSEM
Publisher:
Association for Computational Linguistics
Pages:
39–55
URL:
https://aclanthology.org/2024.clasp-1.7
Cite (ACL):
Bastian Bunzeck and Sina Zarrieß. 2024. Fifty shapes of BLiMP: syntactic learning curves in language models are not uniform, but sometimes unruly. In Proceedings of the 2024 CLASP Conference on Multimodality and Interaction in Language Learning, pages 39–55, Gothenburg, Sweden. Association for Computational Linguistics.
Cite (Informal):
Fifty shapes of BLiMP: syntactic learning curves in language models are not uniform, but sometimes unruly (Bunzeck & Zarrieß, CLASP 2024)
PDF:
https://aclanthology.org/2024.clasp-1.7.pdf