Depth-Bounded Statistical PCFG Induction as a Model of Human Grammar Acquisition

Lifeng Jin, Lane Schwartz, Finale Doshi-Velez, Timothy Miller, William Schuler


Abstract
This article describes a simple PCFG induction model with a fixed category domain that predicts a large majority of attested constituent boundaries, and predicts labels consistent with nearly half of attested constituent labels on a standard evaluation data set of child-directed speech. The article then explores the idea that the difference between simple grammars exhibited by child learners and fully recursive grammars exhibited by adult learners may be an effect of increasing working memory capacity, where the shallow grammars are constrained images of the recursive grammars. An implementation of these memory bounds as limits on center embedding in a depth-specific transform of a recursive grammar yields a significant improvement over an equivalent but unbounded baseline, suggesting that this arrangement may indeed confer a learning advantage.
Anthology ID:
2021.cl-1.7
Volume:
Computational Linguistics, Volume 47, Issue 1 - March 2021
Month:
March
Year:
2021
Address:
Cambridge, MA
Venue:
CL
Publisher:
MIT Press
Pages:
181–216
URL:
https://aclanthology.org/2021.cl-1.7
DOI:
10.1162/coli_a_00399
Cite (ACL):
Lifeng Jin, Lane Schwartz, Finale Doshi-Velez, Timothy Miller, and William Schuler. 2021. Depth-Bounded Statistical PCFG Induction as a Model of Human Grammar Acquisition. Computational Linguistics, 47(1):181–216.
Cite (Informal):
Depth-Bounded Statistical PCFG Induction as a Model of Human Grammar Acquisition (Jin et al., CL 2021)
PDF:
https://aclanthology.org/2021.cl-1.7.pdf
Data
Penn Treebank