Select, Prompt, Filter: Distilling Large Language Models for Summarizing Conversations

Minh-Quang Pham, Sathish Indurthi, Shamil Chollampatt, Marco Turchi


Abstract
Large language models (LLMs) like ChatGPT can be expensive to train, deploy, and use for specific natural language generation tasks, such as text summarization, and for certain domains. A promising alternative is to fine-tune relatively smaller language models (LMs) on a particular task using high-quality, in-domain datasets. However, obtaining such high-quality training data can be prohibitively expensive. This issue has been mitigated by generating weakly supervised data via knowledge distillation (KD) of LLMs. We propose a three-step approach to distill ChatGPT and fine-tune smaller LMs for summarizing forum conversations. More specifically, we design a method to selectively sample from a large unannotated corpus of forum conversations using a semantic similarity metric. We then use the same metric to retrieve suitable prompts for ChatGPT from a small annotated validation set in the same domain. Finally, the generated dataset is filtered to remove low-quality instances. Given the same amount of training data, our proposed select-prompt-filter KD approach improves over a standard KD approach by up to 6.6 ROUGE-2 points by leveraging sufficient in-domain pseudo-labeled data.
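The abstract describes the pipeline only at a high level. The sketch below illustrates one way such a select-prompt-filter pipeline could be wired together, assuming a sentence-embedding model as the semantic similarity metric and a simple length-based heuristic for the filtering step; the specific encoder, thresholds, and heuristics are illustrative assumptions, not the authors' actual settings.

```python
# Illustrative sketch of a select-prompt-filter KD pipeline (not the paper's exact method).
from sentence_transformers import SentenceTransformer, util

encoder = SentenceTransformer("all-MiniLM-L6-v2")  # assumed embedding model

def select_conversations(unlabeled, validation, top_k=1000):
    """Select: keep unlabeled conversations most similar to the in-domain validation set."""
    u_emb = encoder.encode(unlabeled, convert_to_tensor=True)
    v_emb = encoder.encode(validation, convert_to_tensor=True)
    # Score each unlabeled conversation by its highest similarity to any validation example.
    scores = util.cos_sim(u_emb, v_emb).max(dim=1).values
    ranked = scores.argsort(descending=True)[:top_k]
    return [unlabeled[i] for i in ranked]

def retrieve_prompt(conversation, validation_pairs):
    """Prompt: use the most similar annotated (conversation, summary) pair as an in-context example."""
    c_emb = encoder.encode(conversation, convert_to_tensor=True)
    v_emb = encoder.encode([c for c, _ in validation_pairs], convert_to_tensor=True)
    best = util.cos_sim(c_emb, v_emb).argmax().item()
    demo_conv, demo_summary = validation_pairs[best]
    return (f"Conversation:\n{demo_conv}\nSummary:\n{demo_summary}\n\n"
            f"Conversation:\n{conversation}\nSummary:")

def filter_pairs(pseudo_pairs, min_len=10, max_ratio=0.5):
    """Filter: drop pseudo-labeled pairs that look low quality (illustrative length heuristic)."""
    return [(c, s) for c, s in pseudo_pairs
            if min_len <= len(s.split()) <= max_ratio * len(c.split())]
```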
Anthology ID:
2023.emnlp-main.753
Volume:
Proceedings of the 2023 Conference on Empirical Methods in Natural Language Processing
Month:
December
Year:
2023
Address:
Singapore
Editors:
Houda Bouamor, Juan Pino, Kalika Bali
Venue:
EMNLP
Publisher:
Association for Computational Linguistics
Pages:
12257–12265
URL:
https://aclanthology.org/2023.emnlp-main.753
DOI:
10.18653/v1/2023.emnlp-main.753
Cite (ACL):
Minh-Quang Pham, Sathish Indurthi, Shamil Chollampatt, and Marco Turchi. 2023. Select, Prompt, Filter: Distilling Large Language Models for Summarizing Conversations. In Proceedings of the 2023 Conference on Empirical Methods in Natural Language Processing, pages 12257–12265, Singapore. Association for Computational Linguistics.
Cite (Informal):
Select, Prompt, Filter: Distilling Large Language Models for Summarizing Conversations (Pham et al., EMNLP 2023)
PDF:
https://aclanthology.org/2023.emnlp-main.753.pdf
Video:
https://aclanthology.org/2023.emnlp-main.753.mp4