Within the currently dominant Minimalist framework for syntax (Chomsky, 1995, 2000), it is not uncommon to encounter multiple proposals for the same natural language pattern in the literature. We investigate the possibility of evaluating and comparing analyses of syntax phenomena, implemented as minimalist grammars (Stabler, 1997), from a quantitative point of view. This paper introduces a principled way of making linguistic generalizations by detecting and eliminating syntactic and phonological redundancies in the data. As proof of concept, we first provide a small step-by-step example transforming a naive grammar over unsegmented words into a linguistically motivated grammar over morphemes, and then discuss a description of the English auxiliary system, passives, and raising verbs produced by a prototype implementation of a procedure for automated grammar optimization.
Probabilistic approaches have proven themselves well in learning phonological structure. In contrast, theoretical linguistics usually works with deterministic generalizations. The goal of this paper is to explore possible interactions between information-theoretic methods and deterministic linguistic knowledge and to examine some ways in which both can be used in tandem to extract phonological and morphophonological patterns from a small annotated dataset. Local and nonlocal processes in Mishar Tatar (Turkic/Kipchak) are examined as a case study.