This paper is a theoretical contribution to the debate on the learnability of syntax from a corpus without explicit syntax-specific guidance. Our approach starts from the observable structure of a corpus, which we use to define and isolate grammaticality (syntactic information) and meaning and pragmatic information. We describe the formal characteristics of an autonomous syntax and show that it becomes possible to search for syntax-based lexical categories with a simple optimization process, without any prior hypothesis about the form of the model.
Unsupervised Spectral Learning of WCFG as Low-rank Matrix Completion
Raphaël Bailly | Xavier Carreras | Franco M. Luque | Ariadna Quattoni
Proceedings of the 2013 Conference on Empirical Methods in Natural Language Processing