Unsupervised Learning of PCFGs with Normalizing Flow

Lifeng Jin, Finale Doshi-Velez, Timothy Miller, Lane Schwartz, William Schuler


Abstract
Unsupervised PCFG inducers hypothesize sets of compact context-free rules as explanations for sentences. PCFG induction not only provides tools for low-resource languages, but also plays an important role in modeling language acquisition (Bannard et al., 2009; Abend et al., 2017). However, current PCFG induction models, using word tokens as input, are unable to incorporate semantics and morphology into induction, and may encounter issues of sparse vocabulary when facing morphologically rich languages. This paper describes a neural PCFG inducer which employs context embeddings (Peters et al., 2018) in a normalizing flow model (Dinh et al., 2015) to extend PCFG induction to use semantic and morphological information. Linguistically motivated sparsity and categorical distance constraints are imposed on the inducer as regularization. Experiments show that the PCFG induction model with normalizing flow produces grammars with state-of-the-art accuracy on a variety of different languages. Ablation further shows a positive effect of normalizing flow, context embeddings and the proposed regularizers.
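The normalizing flow the abstract cites (Dinh et al., 2015, NICE) is built from invertible additive coupling layers with a unit-triangular Jacobian, so the log-determinant term in the change-of-variables density is zero. The sketch below, using NumPy, illustrates one such layer; dimensions, the coupling network, and all variable names are illustrative assumptions, not the authors' implementation.

```python
import numpy as np

rng = np.random.default_rng(0)

# Illustrative sizes (assumptions, not from the paper):
D = 8   # embedding dimension; split in half by the coupling layer
H = 16  # hidden width of the small coupling network m(.)

W1 = rng.normal(0, 0.1, (D // 2, H))
b1 = np.zeros(H)
W2 = rng.normal(0, 0.1, (H, D // 2))
b2 = np.zeros(D // 2)

def coupling_net(x_a):
    """Small MLP m(.) applied to the untouched half of the input."""
    return np.tanh(x_a @ W1 + b1) @ W2 + b2

def forward(x):
    """Additive coupling: y_a = x_a, y_b = x_b + m(x_a).
    The Jacobian is unit-triangular, so log|det J| = 0 and the
    flow's density is just the base density of the output."""
    x_a, x_b = x[: D // 2], x[D // 2 :]
    return np.concatenate([x_a, x_b + coupling_net(x_a)])

def inverse(y):
    """Exact inverse, reusing the same network: x_b = y_b - m(y_a)."""
    y_a, y_b = y[: D // 2], y[D // 2 :]
    return np.concatenate([y_a, y_b - coupling_net(y_a)])

x = rng.normal(size=D)
assert np.allclose(inverse(forward(x)), x)  # invertible by construction
```

Stacking several such layers (alternating which half is transformed) yields an expressive yet exactly invertible map, which is what lets a model evaluate the likelihood of pretrained context embeddings in closed form.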
Anthology ID:
P19-1234
Volume:
Proceedings of the 57th Annual Meeting of the Association for Computational Linguistics
Month:
July
Year:
2019
Address:
Florence, Italy
Editors:
Anna Korhonen, David Traum, Lluís Màrquez
Venue:
ACL
Publisher:
Association for Computational Linguistics
Pages:
2442–2452
URL:
https://aclanthology.org/P19-1234
DOI:
10.18653/v1/P19-1234
Cite (ACL):
Lifeng Jin, Finale Doshi-Velez, Timothy Miller, Lane Schwartz, and William Schuler. 2019. Unsupervised Learning of PCFGs with Normalizing Flow. In Proceedings of the 57th Annual Meeting of the Association for Computational Linguistics, pages 2442–2452, Florence, Italy. Association for Computational Linguistics.
Cite (Informal):
Unsupervised Learning of PCFGs with Normalizing Flow (Jin et al., ACL 2019)
PDF:
https://aclanthology.org/P19-1234.pdf
Data
Penn Treebank, Universal Dependencies