Massive-scale Decoding for Text Generation using Lattices

Jiacheng Xu, Siddhartha Jonnalagadda, Greg Durrett


Abstract
Conditional neural text generation models generate high-quality outputs, but often concentrate around a mode when what we really want is a diverse set of options. We present a search algorithm to construct lattices encoding a massive number of generation options. First, we restructure decoding as a best-first search, which explores the space differently than beam search and improves efficiency by avoiding pruning paths. Second, we revisit the idea of hypothesis recombination: we can identify pairs of similar generation candidates during search and merge them as an approximation. On both summarization and machine translation, we show that our algorithm encodes thousands of diverse options that remain grammatical and high-quality into one lattice. This algorithm provides a foundation for building downstream generation applications on top of massive-scale diverse outputs.
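The two ideas in the abstract — best-first exploration of the decoding space and merging similar hypotheses into shared lattice nodes — can be sketched as follows. This is a minimal illustration, not the paper's implementation: the bigram "model", the vocabulary, and the merge criterion (hypotheses sharing their most recent token are recombined) are all simplifying assumptions.

```python
import heapq

# Toy bigram "model": log-prob of the next token given the previous one.
# A stand-in for a real conditional generation model (summarization/MT).
MODEL = {
    "<s>": {"the": -0.5, "a": -1.0},
    "the": {"cat": -0.7, "dog": -0.8, "</s>": -2.0},
    "a":   {"cat": -0.9, "dog": -0.6, "</s>": -2.0},
    "cat": {"sat": -0.5, "</s>": -1.0},
    "dog": {"sat": -0.6, "</s>": -1.0},
    "sat": {"</s>": -0.3},
}

def bfs_lattice(max_expansions=50, merge_n=1):
    """Best-first search with (simplified) hypothesis recombination.

    Hypotheses whose last `merge_n` tokens match are merged into one
    lattice node, so later paths reaching that node are not re-expanded.
    Returns (lattice edges as {node_key: set(successor keys)},
             list of (score, path) for finished hypotheses).
    """
    lattice = {}
    heap = [(0.0, ("<s>",))]   # entries: (cumulative -log-prob, path)
    expanded = set()           # merged states already expanded
    finished = []
    while heap and max_expansions > 0:
        cost, path = heapq.heappop(heap)
        if path[-1] == "</s>":
            finished.append((-cost, path))
            continue
        key = path[-merge_n:]  # recombination key: recent tokens only
        if key in expanded:
            continue           # merged into an existing lattice node
        expanded.add(key)
        max_expansions -= 1
        for tok, lp in MODEL.get(path[-1], {}).items():
            new_path = path + (tok,)
            # Record a lattice edge from this node to the successor node.
            lattice.setdefault(key, set()).add(new_path[-merge_n:])
            heapq.heappush(heap, (cost - lp, new_path))
    return lattice, finished
```

Because both the `"the"` and `"a"` prefixes record edges into the same shared `"cat"` and `"dog"` nodes, the result is a lattice rather than a list of separate hypotheses — many distinct surface paths are encoded with far fewer node expansions than exhaustive search would need.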
Anthology ID:
2022.naacl-main.344
Volume:
Proceedings of the 2022 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies
Month:
July
Year:
2022
Address:
Seattle, United States
Venue:
NAACL
Publisher:
Association for Computational Linguistics
Pages:
4659–4676
URL:
https://aclanthology.org/2022.naacl-main.344
DOI:
10.18653/v1/2022.naacl-main.344
Cite (ACL):
Jiacheng Xu, Siddhartha Jonnalagadda, and Greg Durrett. 2022. Massive-scale Decoding for Text Generation using Lattices. In Proceedings of the 2022 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies, pages 4659–4676, Seattle, United States. Association for Computational Linguistics.
Cite (Informal):
Massive-scale Decoding for Text Generation using Lattices (Xu et al., NAACL 2022)
PDF:
https://aclanthology.org/2022.naacl-main.344.pdf
Code
jiacheng-xu/lattice-generation
Data
WMT 2014