MNLP at MEDIQA 2021: Fine-Tuning PEGASUS for Consumer Health Question Summarization

Jooyeon Lee, Huong Dang, Ozlem Uzuner, Sam Henry


Abstract
This paper details a Consumer Health Question (CHQ) summarization model submitted to MEDIQA 2021 for shared task 1: Question Summarization. Many CHQs are composed of multiple sentences with typos or unnecessary information, which can interfere with automated question answering systems. Question summarization mitigates this issue by removing this unnecessary information, aiding automated systems in generating a more accurate summary. Our summarization approach focuses on applying multiple pre-processing techniques, including question focus identification on the input, and on the development of an ensemble method to combine question focus with an abstractive summarization method. We use the state-of-the-art abstractive summarization model, PEGASUS (Pre-training with Extracted Gap-sentences for Abstractive Summarization), to generate abstractive summaries. Our experiments show that using our ensemble method, which combines abstractive summarization with question focus identification, improves performance over using summarization alone. Our model shows a ROUGE-2 F-measure of 11.14% against the official test dataset.
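The ROUGE-2 F-measure reported above is the harmonic mean of bigram precision and bigram recall between a generated summary and a reference summary. The following is a minimal sketch of that computation, assuming lowercase whitespace tokenization with no stemming; the official task scoring may tokenize and normalize differently, so this is not the evaluation script used for the reported result. The function names and the example question pair are hypothetical and are not taken from MeQSum.

```python
from collections import Counter


def bigrams(text):
    """Lowercase, whitespace-tokenize, and count adjacent token pairs."""
    tokens = text.lower().split()
    return Counter(zip(tokens, tokens[1:]))


def rouge2_f(candidate, reference):
    """ROUGE-2 F-measure: harmonic mean of bigram precision and recall."""
    cand, ref = bigrams(candidate), bigrams(reference)
    # Counter intersection keeps the minimum count per bigram (clipped matches).
    overlap = sum((cand & ref).values())
    if overlap == 0:
        return 0.0
    precision = overlap / sum(cand.values())
    recall = overlap / sum(ref.values())
    return 2 * precision * recall / (precision + recall)


# Hypothetical example pair, for illustration only.
reference = "what are the treatments for migraine"
candidate = "what are the treatments for chronic migraine"
score = rouge2_f(candidate, reference)
print(f"ROUGE-2 F = {score:.4f}")
```

In this example the candidate shares 4 of its 6 bigrams with the reference, which has 5 bigrams, so precision is 4/6, recall is 4/5, and the F-measure is 8/11.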
Anthology ID:
2021.bionlp-1.37
Volume:
Proceedings of the 20th Workshop on Biomedical Language Processing
Month:
June
Year:
2021
Address:
Online
Venues:
BioNLP | NAACL
SIG:
SIGBIOMED
Publisher:
Association for Computational Linguistics
Pages:
320–327
URL:
https://aclanthology.org/2021.bionlp-1.37
DOI:
10.18653/v1/2021.bionlp-1.37
Cite (ACL):
Jooyeon Lee, Huong Dang, Ozlem Uzuner, and Sam Henry. 2021. MNLP at MEDIQA 2021: Fine-Tuning PEGASUS for Consumer Health Question Summarization. In Proceedings of the 20th Workshop on Biomedical Language Processing, pages 320–327, Online. Association for Computational Linguistics.
Cite (Informal):
MNLP at MEDIQA 2021: Fine-Tuning PEGASUS for Consumer Health Question Summarization (Lee et al., BioNLP 2021)
PDF:
https://aclanthology.org/2021.bionlp-1.37.pdf
Data
MeQSum