Generating Diverse and High-Quality Texts by Minimum Bayes Risk Decoding

Yuu Jinnai; Ukyo Honda; Tetsuro Morimura; Peinan Zhang

doi:10.18653/v1/2024.findings-acl.503


Abstract

One of the most important challenges in text generation systems is to produce outputs that are not only correct but also diverse. Recently, Minimum Bayes-Risk (MBR) decoding has gained prominence for generating sentences of the highest quality among the decoding algorithms. However, existing algorithms proposed to generate diverse outputs are predominantly based on beam search or random sampling, so their output quality is capped by these underlying decoding algorithms. In this paper, we investigate an alternative approach: we develop diversity-promoting decoding algorithms by imposing diversity objectives on MBR decoding. We propose two variants of MBR: (i) Diverse MBR (DMBR), which adds a diversity penalty to the decoding objective, and (ii) k-medoids MBR (KMBR), which reformulates the decoding task as a clustering problem. We evaluate DMBR and KMBR on a variety of directed text generation tasks using encoder-decoder models and a language model with prompting. The experimental results show that the proposed methods achieve a better overall trade-off between quality and diversity than diverse beam search and sampling algorithms.
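
The abstract does not spell out the decoding objective, so the following is only an illustrative sketch of the general idea of adding a diversity penalty to MBR selection, not the authors' formulation. It assumes a greedy selection procedure, a toy token-overlap (Jaccard) utility standing in for a real metric, and a penalty weight `lam`; all function and parameter names are hypothetical.

```python
from typing import Callable, List


def jaccard(a: str, b: str) -> float:
    """Toy utility (assumption): token-set overlap in [0, 1].

    A real system would use a task metric here instead.
    """
    sa, sb = set(a.split()), set(b.split())
    if not sa and not sb:
        return 1.0
    return len(sa & sb) / len(sa | sb)


def expected_utility(
    candidates: List[str], utility: Callable[[str, str], float]
) -> List[float]:
    """Monte Carlo MBR score: mean utility of each candidate against all samples."""
    n = len(candidates)
    return [sum(utility(h, r) for r in candidates) / n for h in candidates]


def greedy_diverse_mbr(
    candidates: List[str],
    k: int,
    lam: float,
    utility: Callable[[str, str], float] = jaccard,
) -> List[str]:
    """Greedily pick k outputs, trading MBR score against similarity to earlier picks.

    Assumed objective (hypothetical): score(i) - lam * sum of utilities between
    candidate i and the outputs already selected.
    """
    scores = expected_utility(candidates, utility)
    selected: List[int] = []
    remaining = list(range(len(candidates)))
    while remaining and len(selected) < k:

        def gain(i: int) -> float:
            penalty = sum(utility(candidates[i], candidates[j]) for j in selected)
            return scores[i] - lam * penalty

        best = max(remaining, key=gain)
        selected.append(best)
        remaining.remove(best)
    return [candidates[i] for i in selected]
```

With `lam = 0` this reduces to ranking candidates by their plain MBR score; raising `lam` pushes later picks away from outputs that resemble those already chosen.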

Anthology ID: 2024.findings-acl.503
Volume: Findings of the Association for Computational Linguistics: ACL 2024
Month: August
Year: 2024
Address: Bangkok, Thailand
Editors: Lun-Wei Ku, Andre Martins, Vivek Srikumar
Venue: Findings
Publisher: Association for Computational Linguistics
Pages: 8494–8525
URL: https://aclanthology.org/2024.findings-acl.503/
DOI: 10.18653/v1/2024.findings-acl.503
Cite (ACL): Yuu Jinnai, Ukyo Honda, Tetsuro Morimura, and Peinan Zhang. 2024. Generating Diverse and High-Quality Texts by Minimum Bayes Risk Decoding. In Findings of the Association for Computational Linguistics: ACL 2024, pages 8494–8525, Bangkok, Thailand. Association for Computational Linguistics.
Cite (Informal): Generating Diverse and High-Quality Texts by Minimum Bayes Risk Decoding (Jinnai et al., Findings 2024)
PDF: https://aclanthology.org/2024.findings-acl.503.pdf
