During the course of first language acquisition, children produce linguistic forms that do not conform to adult grammar. In this paper, we introduce a data set and approach for systematically modeling this child-adult grammar divergence. Our corpus consists of child sentences with corrected adult forms. We bridge the gap between these forms with a discriminatively reranked noisy channel model that translates child sentences into equivalent adult utterances. Our method outperforms MT and ESL baselines, reducing child error by 20%. Our model allows us to chart specific aspects of grammar development in longitudinal studies of children, and investigate the hypothesis that children share a common developmental path in language acquisition.
Automatically Learning Measures of Child Language Development
Sam Sahakian | Benjamin Snyder
Proceedings of the 50th Annual Meeting of the Association for Computational Linguistics (Volume 2: Short Papers)