Selecting Better Samples from Pre-trained LLMs: A Case Study on Question Generation

Xingdi Yuan; Tong Wang; Yen-Hsiang Wang; Emery Fine; Rania Abdelghani; Hélène Sauzéon; Pierre-Yves Oudeyer

doi:10.18653/v1/2023.findings-acl.820

Abstract

Large Language Models (LLMs) have in recent years demonstrated impressive prowess in natural language generation. A common practice to improve generation diversity is to sample multiple outputs from the model. However, partly due to the inaccessibility of LLMs, there is no simple and robust way to select the best output from these stochastic samples. As a case study framed in the context of question generation, we propose two prompt-based approaches, namely round-trip and prompt-based score, for selecting high-quality questions from a set of LLM-generated candidates. Our method works without modifying the underlying model and without relying on human-annotated references, both of which are realistic constraints for real-world deployment of LLMs. With automatic as well as human evaluations, we empirically demonstrate that our approach can effectively select questions of higher quality than greedy generation.

Anthology ID: 2023.findings-acl.820
Volume: Findings of the Association for Computational Linguistics: ACL 2023
Month: July
Year: 2023
Address: Toronto, Canada
Editors: Anna Rogers, Jordan Boyd-Graber, Naoaki Okazaki
Venue: Findings
Publisher: Association for Computational Linguistics
Pages: 12952–12965
URL: https://aclanthology.org/2023.findings-acl.820/
DOI: 10.18653/v1/2023.findings-acl.820
Cite (ACL): Xingdi Yuan, Tong Wang, Yen-Hsiang Wang, Emery Fine, Rania Abdelghani, Hélène Sauzéon, and Pierre-Yves Oudeyer. 2023. Selecting Better Samples from Pre-trained LLMs: A Case Study on Question Generation. In Findings of the Association for Computational Linguistics: ACL 2023, pages 12952–12965, Toronto, Canada. Association for Computational Linguistics.
Cite (Informal): Selecting Better Samples from Pre-trained LLMs: A Case Study on Question Generation (Yuan et al., Findings 2023)
PDF: https://aclanthology.org/2023.findings-acl.820.pdf
Video: https://aclanthology.org/2023.findings-acl.820.mp4