Recursive Top-Down Production for Sentence Generation with Latent Trees

Shawn Tan, Yikang Shen, Alessandro Sordoni, Aaron Courville, Timothy J. O’Donnell


Abstract
We model the recursive production property of context-free grammars for natural and synthetic languages. To this end, we present a dynamic programming algorithm that marginalises over latent binary tree structures with N leaves, allowing us to compute the likelihood of a sequence of N tokens under a latent tree model, which we maximise to train a recursive neural function. We demonstrate performance on two synthetic tasks: SCAN, where it outperforms previous models on the LENGTH split, and English question formation, where it performs comparably to decoders with the ground-truth tree structure. We also present experimental results on German-English translation on the Multi30k dataset, and qualitatively analyse the induced tree structures our model learns for the SCAN tasks and the German-English translation task.
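The marginalisation over latent binary trees described in the abstract can be realised with a CKY-style inside algorithm: every span of the sentence accumulates, in log-space, the summed score of all binary bracketings of that span. The sketch below is illustrative only and is not the paper's implementation; the `span_logscore` function (a per-span log-score from some neural model) and all names are assumptions for the example.

```python
import math

def logsumexp(xs):
    # Numerically stable log(sum(exp(x) for x in xs)).
    m = max(xs)
    return m + math.log(sum(math.exp(x - m) for x in xs))

def inside_log_marginal(n, span_logscore):
    """Log of the total score of all binary trees over n leaves.

    span_logscore(i, j) is the log-score the model assigns to the
    span covering tokens i..j-1 (a hypothetical scoring function).
    Runs in O(n^3) time via dynamic programming over spans.
    """
    chart = {}
    # Width-1 spans: the leaves themselves.
    for i in range(n):
        chart[(i, i + 1)] = span_logscore(i, i + 1)
    # Wider spans: sum over every split point k of the two subtrees.
    for width in range(2, n + 1):
        for i in range(n - width + 1):
            j = i + width
            splits = [chart[(i, k)] + chart[(k, j)] for k in range(i + 1, j)]
            chart[(i, j)] = span_logscore(i, j) + logsumexp(splits)
    return chart[(0, n)]
```

As a sanity check, with all span scores set to zero the result is the log of the number of binary trees over n leaves, i.e. the (n-1)-th Catalan number, since every tree then contributes a score of exactly 1.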
Anthology ID:
2020.findings-emnlp.208
Volume:
Findings of the Association for Computational Linguistics: EMNLP 2020
Month:
November
Year:
2020
Address:
Online
Editors:
Trevor Cohn, Yulan He, Yang Liu
Venue:
Findings
Publisher:
Association for Computational Linguistics
Pages:
2291–2307
URL:
https://aclanthology.org/2020.findings-emnlp.208
DOI:
10.18653/v1/2020.findings-emnlp.208
Cite (ACL):
Shawn Tan, Yikang Shen, Alessandro Sordoni, Aaron Courville, and Timothy J. O’Donnell. 2020. Recursive Top-Down Production for Sentence Generation with Latent Trees. In Findings of the Association for Computational Linguistics: EMNLP 2020, pages 2291–2307, Online. Association for Computational Linguistics.
Cite (Informal):
Recursive Top-Down Production for Sentence Generation with Latent Trees (Tan et al., Findings 2020)
PDF:
https://aclanthology.org/2020.findings-emnlp.208.pdf
Optional supplementary material:
2020.findings-emnlp.208.OptionalSupplementaryMaterial.zip
Data
SCAN