Transformer-based Speech Model Learns Well as Infants and Encodes Abstractions through Exemplars in the Poverty of the Stimulus Environment

Yi Yang, Yiming Wang, Jiahong Yuan


Abstract
Infants are capable of learning language, predominantly through speech and associations, in impoverished environments, a phenomenon known as the Poverty of the Stimulus (POS). Is this ability uniquely human, reflecting an innate linguistic predisposition, or can it be learned empirically from sparse and noisy exemplars that carry latent linguistic structure? As an early exploratory work, we systematically designed a series of tasks, scenarios, and metrics to simulate the POS. We found that the emerging speech model wav2vec2.0, with weights pretrained on an English corpus, can learn well in noisy and sparse Mandarin environments. We then tested various hypotheses and observed three pieces of evidence for abstraction: label correction, categorical patterns, and clustering effects. We conclude that models can encode hierarchical linguistic abstractions through exemplars in POS environments. We hope this work offers new insights into language acquisition from a speech perspective and inspires further research.
Anthology ID: 2025.coling-main.528
Volume: Proceedings of the 31st International Conference on Computational Linguistics
Month: January
Year: 2025
Address: Abu Dhabi, UAE
Editors: Owen Rambow, Leo Wanner, Marianna Apidianaki, Hend Al-Khalifa, Barbara Di Eugenio, Steven Schockaert
Venue: COLING
Publisher: Association for Computational Linguistics
Pages: 7881–7890
URL: https://aclanthology.org/2025.coling-main.528/
Cite (ACL):
Yi Yang, Yiming Wang, and Jiahong Yuan. 2025. Transformer-based Speech Model Learns Well as Infants and Encodes Abstractions through Exemplars in the Poverty of the Stimulus Environment. In Proceedings of the 31st International Conference on Computational Linguistics, pages 7881–7890, Abu Dhabi, UAE. Association for Computational Linguistics.
Cite (Informal):
Transformer-based Speech Model Learns Well as Infants and Encodes Abstractions through Exemplars in the Poverty of the Stimulus Environment (Yang et al., COLING 2025)
PDF: https://aclanthology.org/2025.coling-main.528.pdf