Can Language Models Induce Grammatical Knowledge from Indirect Evidence?

Miyu Oba; Yohei Oseki; Akiyo Fukatsu; Akari Haga; Hiroki Ouchi; Taro Watanabe; Saku Sugawara

doi:10.18653/v1/2024.emnlp-main.1146


Abstract

What kinds of data, and how much of it, do language models need in order to induce the grammatical knowledge required to judge sentence acceptability? Recent language models still have much room for improvement in data efficiency compared to humans. This paper investigates whether language models efficiently use indirect data (indirect evidence) from which sentence acceptability can be inferred. Humans, by contrast, use indirect evidence efficiently, which is considered one of the inductive biases contributing to efficient language acquisition. To explore this question, we introduce the Wug InDirect Evidence Test (WIDET), a dataset consisting of training instances to be inserted into the pre-training data and evaluation instances. We inject synthetic instances containing newly coined wug words into the pre-training data and examine the model’s behavior on evaluation data that assesses grammatical acceptability regarding those words. We prepare the injected instances by varying their level of indirectness and their quantity. Surprisingly, our experiments show that, for certain linguistic phenomena, language models do not induce grammatical knowledge even after repeated exposure to instances that have the same structure as the evaluation instances and differ only in lexical items. Our findings suggest a potential direction for future research: developing models that use latent indirect evidence to induce grammatical knowledge.
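The setup described in the abstract (injecting synthetic wug-word instances into training data, then comparing acceptable and unacceptable sentences) can be illustrated with a toy sketch. Everything here is an assumption made for illustration: the helper names `inject_instances`, `bigram_scorer`, and `minimal_pair_accuracy`, the wug word `blick`, the tiny corpus, and the bigram-overlap scorer, which merely stands in for a language model's sentence score. It shows only the most direct kind of evidence and is not the WIDET data or the authors' pipeline.

```python
import random
from typing import Callable, List, Tuple


def inject_instances(corpus: List[str], template: str, wug: str,
                     n: int, seed: int = 0) -> List[str]:
    """Return a copy of `corpus` with `n` copies of the filled template
    inserted at random positions (hypothetical helper)."""
    rng = random.Random(seed)
    augmented = list(corpus)
    sentence = template.format(wug=wug)
    for _ in range(n):
        # randint is inclusive, so index len(augmented) appends at the end.
        augmented.insert(rng.randint(0, len(augmented)), sentence)
    return augmented


def bigram_scorer(corpus: List[str]) -> Callable[[str], int]:
    """Toy stand-in for a language model: scores a sentence by how many
    of its bigrams were seen in the training corpus."""
    seen = set()
    for sent in corpus:
        toks = sent.split()
        seen.update(zip(toks, toks[1:]))

    def score(sentence: str) -> int:
        toks = sentence.split()
        return sum(bigram in seen for bigram in zip(toks, toks[1:]))

    return score


def minimal_pair_accuracy(pairs: List[Tuple[str, str]],
                          score: Callable[[str], float]) -> float:
    """Fraction of (acceptable, unacceptable) pairs in which the
    acceptable sentence receives the strictly higher score."""
    correct = sum(score(good) > score(bad) for good, bad in pairs)
    return correct / len(pairs)


# Assumed toy data: a two-sentence corpus and one subject-verb agreement pair.
base_corpus = ["the dogs run .", "the dog runs ."]
augmented = inject_instances(base_corpus, "the {wug} runs .", "blick", n=3)
pairs = [("the blick runs .", "the blick run .")]

acc_without = minimal_pair_accuracy(pairs, bigram_scorer(base_corpus))
acc_with = minimal_pair_accuracy(pairs, bigram_scorer(augmented))
```

In this sketch the injected sentence matches the acceptable evaluation sentence exactly, so the toy scorer separates the pair only after injection. Testing indirect evidence would mean injecting sentences that differ from the evaluation instances, for example in lexical items or structure, and varying how many are injected.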

Anthology ID: 2024.emnlp-main.1146
Volume: Proceedings of the 2024 Conference on Empirical Methods in Natural Language Processing
Month: November
Year: 2024
Address: Miami, Florida, USA
Editors: Yaser Al-Onaizan, Mohit Bansal, Yun-Nung Chen
Venue: EMNLP
Publisher: Association for Computational Linguistics
Pages: 20591–20603
URL: https://aclanthology.org/2024.emnlp-main.1146/
DOI: 10.18653/v1/2024.emnlp-main.1146
Cite (ACL): Miyu Oba, Yohei Oseki, Akiyo Fukatsu, Akari Haga, Hiroki Ouchi, Taro Watanabe, and Saku Sugawara. 2024. Can Language Models Induce Grammatical Knowledge from Indirect Evidence?. In Proceedings of the 2024 Conference on Empirical Methods in Natural Language Processing, pages 20591–20603, Miami, Florida, USA. Association for Computational Linguistics.
Cite (Informal): Can Language Models Induce Grammatical Knowledge from Indirect Evidence? (Oba et al., EMNLP 2024)
PDF: https://aclanthology.org/2024.emnlp-main.1146.pdf
