CLASS-IT: Conversational and Lecture-Aligned Small-Scale Instruction Tuning for BabyLMs

Luca Capone; Alessandro Bondielli; Alessandro Lenci

doi:10.18653/v1/2025.babylm-main.30

Abstract

This work investigates whether small-scale LMs can benefit from instruction tuning (IT). We compare conversational and question–answering IT datasets, applied either in a merged or sequential curriculum, using decoder-only models with 100M and 140M parameters. Evaluation spans both fine-tuning (SuperGLUE) and zero-shot (BLiMP, EWoK, WUGs, entity tracking, and psycholinguistic correlation) settings. Results show that IT yields small but consistent gains in fine-tuning scenarios, with sequential curricula outperforming merged data; however, improvements do not consistently transfer to zero-shot tasks, suggesting a trade-off between interaction-focused adaptation and broad linguistic generalization. These results highlight both the potential and the constraints of adapting human-inspired learning strategies to low-resource LMs, and point toward hybrid, curriculum-based approaches for enhancing generalization under ecological training limits.

Anthology ID: 2025.babylm-main.30
Volume: Proceedings of the First BabyLM Workshop
Month: November
Year: 2025
Address: Suzhou, China
Editors: Lucas Charpentier, Leshem Choshen, Ryan Cotterell, Mustafa Omer Gul, Michael Y. Hu, Jing Liu, Jaap Jumelet, Tal Linzen, Aaron Mueller, Candace Ross, Raj Sanjay Shah, Alex Warstadt, Ethan Gotlieb Wilcox, Adina Williams
Venue: BabyLM
Publisher: Association for Computational Linguistics
Pages: 436–444
URL: https://aclanthology.org/2025.babylm-main.30/
DOI: 10.18653/v1/2025.babylm-main.30
Cite (ACL): Luca Capone, Alessandro Bondielli, and Alessandro Lenci. 2025. CLASS-IT: Conversational and Lecture-Aligned Small-Scale Instruction Tuning for BabyLMs. In Proceedings of the First BabyLM Workshop, pages 436–444, Suzhou, China. Association for Computational Linguistics.
Cite (Informal): CLASS-IT: Conversational and Lecture-Aligned Small-Scale Instruction Tuning for BabyLMs (Capone et al., BabyLM 2025)
PDF: https://aclanthology.org/2025.babylm-main.30.pdf
