Self-Ensemble of N-best Generation Hypotheses by Lexically Constrained Decoding

Ryota Miyano, Tomoyuki Kajiwara, Yuki Arase


Abstract
We propose a method that ensembles N-best hypotheses to improve natural language generation. Previous studies have achieved notable improvements in generation quality by explicitly reranking N-best candidates, under the assumption that a hypothesis of higher quality exists in the list. We relax this assumption to a more practical one: the N-best list contains hypotheses whose fragments are of higher quality, even if every hypothesis is imperfect as a whole sentence. By merging these high-quality fragments, we can obtain an output of higher quality than the single-best hypothesis. Specifically, we first obtain N-best hypotheses and conduct token-level quality estimation. We then use the tokens that should or should not appear in the final output as lexical constraints during decoding. Empirical experiments on paraphrase generation, summarisation, and constrained text generation confirm that our method outperforms strong N-best reranking methods.
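The pipeline described in the abstract (N-best generation, token-level quality estimation, constrained re-decoding) can be sketched roughly as follows. This is a minimal illustration assuming a Hugging Face seq2seq model, not the authors' implementation: `score_tokens` is a hypothetical placeholder for the paper's token-level quality estimator, the `keep`/`drop` thresholds are invented for illustration, and Hugging Face's `force_words_ids`/`bad_words_ids` generation arguments stand in for the paper's lexically constrained decoder.

```python
# Minimal sketch of self-ensembling N-best hypotheses via lexical constraints.
# Assumptions: facebook/bart-base is only an example model; score_tokens() is a
# hypothetical token-level quality estimator (the paper uses a trained one).
from transformers import AutoTokenizer, AutoModelForSeq2SeqLM

tokenizer = AutoTokenizer.from_pretrained("facebook/bart-base")
model = AutoModelForSeq2SeqLM.from_pretrained("facebook/bart-base")

def score_tokens(hypothesis: str) -> list[tuple[str, float]]:
    """Hypothetical token-level quality estimator: (token, quality in [0, 1])."""
    return [(token, 0.5) for token in hypothesis.split()]

def self_ensemble(source: str, n_best: int = 5,
                  keep: float = 0.8, drop: float = 0.2) -> str:
    inputs = tokenizer(source, return_tensors="pt")

    # Step 1: obtain N-best hypotheses with beam search.
    beams = model.generate(**inputs, num_beams=n_best,
                           num_return_sequences=n_best, max_new_tokens=64)
    hypotheses = tokenizer.batch_decode(beams, skip_special_tokens=True)

    # Step 2: token-level quality estimation over the N-best list, collecting
    # tokens the final output should or should not contain.
    positive, negative = set(), set()
    for hyp in hypotheses:
        for token, quality in score_tokens(hyp):
            if quality >= keep:
                positive.add(token)
            elif quality <= drop:
                negative.add(token)
    negative -= positive  # never forbid a token that is also required

    # Step 3: decode again under lexical constraints: high-quality tokens must
    # appear (force_words_ids), low-quality ones must not (bad_words_ids).
    force = [tokenizer(t, add_special_tokens=False).input_ids for t in positive]
    bad = [tokenizer(t, add_special_tokens=False).input_ids for t in negative]
    output = model.generate(**inputs, num_beams=n_best,
                            force_words_ids=force or None,
                            bad_words_ids=bad or None,
                            max_new_tokens=64)
    return tokenizer.batch_decode(output, skip_special_tokens=True)[0]

print(self_ensemble("She finished the marathon despite the heavy rain."))
```

Note that subword tokenization (e.g. BART's leading-space markers) is glossed over here; a faithful implementation would build constraints over subword spans rather than whitespace-split tokens.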
Anthology ID:
2023.emnlp-main.905
Volume:
Proceedings of the 2023 Conference on Empirical Methods in Natural Language Processing
Month:
December
Year:
2023
Address:
Singapore
Editors:
Houda Bouamor, Juan Pino, Kalika Bali
Venue:
EMNLP
Publisher:
Association for Computational Linguistics
Pages:
14653–14661
URL:
https://aclanthology.org/2023.emnlp-main.905
DOI:
10.18653/v1/2023.emnlp-main.905
Cite (ACL):
Ryota Miyano, Tomoyuki Kajiwara, and Yuki Arase. 2023. Self-Ensemble of N-best Generation Hypotheses by Lexically Constrained Decoding. In Proceedings of the 2023 Conference on Empirical Methods in Natural Language Processing, pages 14653–14661, Singapore. Association for Computational Linguistics.
Cite (Informal):
Self-Ensemble of N-best Generation Hypotheses by Lexically Constrained Decoding (Miyano et al., EMNLP 2023)
PDF:
https://aclanthology.org/2023.emnlp-main.905.pdf
Video:
https://aclanthology.org/2023.emnlp-main.905.mp4