Modeling Overregularization in Children with Small Language Models

Akari Haga, Saku Sugawara, Akiyo Fukatsu, Miyu Oba, Hiroki Ouchi, Taro Watanabe, Yohei Oseki


Abstract
Imitating children’s language acquisition process has been explored as a way to make language models (LMs) more efficient. In particular, errors caused by children’s regularization (so-called overregularization, e.g., using wroted for the past tense of write) have been widely studied to reveal the mechanisms of language acquisition. Existing research has analyzed regularization in language acquisition only by modeling word inflection directly, which is unnatural in light of human language acquisition. In this paper, we hypothesize that language models that imitate the errors children make during language acquisition have a learning process more similar to that of humans. To verify this hypothesis, we analyze the learning curves and error preferences of verb inflections in small-scale LMs using acceptability judgments. We also analyze how the results differ by model architecture, data, and tokenization. Our model clearly shows child-like U-shaped learning curves for certain verbs, but its preferences for types of overgeneralization do not fully match the observations in children.
Anthology ID:
2024.findings-acl.865
Volume:
Findings of the Association for Computational Linguistics ACL 2024
Month:
August
Year:
2024
Address:
Bangkok, Thailand and virtual meeting
Editors:
Lun-Wei Ku, Andre Martins, Vivek Srikumar
Venue:
Findings
Publisher:
Association for Computational Linguistics
Pages:
14532–14550
URL:
https://aclanthology.org/2024.findings-acl.865
Cite (ACL):
Akari Haga, Saku Sugawara, Akiyo Fukatsu, Miyu Oba, Hiroki Ouchi, Taro Watanabe, and Yohei Oseki. 2024. Modeling Overregularization in Children with Small Language Models. In Findings of the Association for Computational Linguistics ACL 2024, pages 14532–14550, Bangkok, Thailand and virtual meeting. Association for Computational Linguistics.
Cite (Informal):
Modeling Overregularization in Children with Small Language Models (Haga et al., Findings 2024)
PDF:
https://aclanthology.org/2024.findings-acl.865.pdf