PPDAC: A Plug-and-Play Data Augmentation Component for Few-shot Extractive Question Answering

Huang Qi (黄琪); Fu Han; Luo Wenbin; Wang Mingwen (王明文); Luo Kaiwei

Abstract

Extractive Question Answering (EQA) in the few-shot learning scenario is one of the most challenging tasks of Machine Reading Comprehension (MRC). Some previous works employ external knowledge for data augmentation to improve the performance of few-shot extractive question answering. However, external knowledge is not always available, nor are the language- and domain-specific NLP tools needed to process it, such as part-of-speech taggers, syntactic parsers, and named-entity recognizers. In this paper, we present a novel Plug-and-Play Data Augmentation Component (PPDAC) for few-shot extractive question answering, which includes a paraphrase generator and a paraphrase selector. Specifically, we generate multiple paraphrases of the question in the (question, passage, answer) triples using the paraphrase generator and then obtain highly similar statements via the paraphrase selector to form more training data for fine-tuning. Extensive experiments on multiple EQA datasets show that our proposed plug-and-play data augmentation component significantly improves question-answering performance, and consistently outperforms state-of-the-art approaches in few-shot settings by a large margin.
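
The generate-then-select flow described in the abstract can be illustrated with a minimal sketch. Everything named here is an assumption for illustration, not the authors' implementation: `toy_generator` is a hypothetical rule-based placeholder for a learned paraphrase generator, `jaccard` is a simple token-overlap score standing in for whatever similarity the paper's selector uses, and the `threshold` value is arbitrary.

```python
from typing import Callable, List, Tuple

Triple = Tuple[str, str, str]  # (question, passage, answer)


def jaccard(a: str, b: str) -> float:
    """Token-overlap similarity. ASSUMPTION: a stand-in for the paper's selector score."""
    ta, tb = set(a.lower().split()), set(b.lower().split())
    if not ta and not tb:
        return 1.0
    return len(ta & tb) / len(ta | tb)


def augment(
    triples: List[Triple],
    generate: Callable[[str], List[str]],
    similarity: Callable[[str, str], float] = jaccard,
    threshold: float = 0.25,  # ASSUMPTION: arbitrary cut-off chosen for this toy example
) -> List[Triple]:
    """Return the original triples plus triples whose question is a selected paraphrase."""
    augmented = list(triples)
    for question, passage, answer in triples:
        seen = {question}
        for candidate in generate(question):
            if candidate in seen:
                continue  # skip exact duplicates and unchanged questions
            seen.add(candidate)
            # Selector step: keep only candidates close enough to the source question.
            if similarity(question, candidate) >= threshold:
                # Passage and answer are reused unchanged; only the question varies.
                augmented.append((candidate, passage, answer))
    return augmented


def toy_generator(question: str) -> List[str]:
    """HYPOTHETICAL placeholder for a learned paraphrase generator."""
    return [
        question.replace("Who wrote", "Who is the author of"),
        question.replace("Who wrote", "Which person wrote"),
        "What is the weather today?",  # off-topic candidate the selector should drop
    ]


data = [("Who wrote Hamlet?", "Hamlet is a tragedy by William Shakespeare.", "William Shakespeare")]
result = augment(data, toy_generator)
for triple in result:
    print(triple)
```

In this toy run the two on-topic rewrites pass the selector and the off-topic candidate is dropped, so the single seed triple becomes three training triples that share the same passage and answer. Because `generate` and `similarity` are plain callables, either can be swapped for a model-based component without changing the surrounding fine-tuning code, which is one way to read the "plug-and-play" framing.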

Anthology ID: 2024.ccl-1.102
Volume: Proceedings of the 23rd Chinese National Conference on Computational Linguistics (Volume 1: Main Conference)
Month: July
Year: 2024
Address: Taiyuan, China
Editors: Maosong Sun, Jiye Liang, Xianpei Han, Zhiyuan Liu, Yulan He
Venue: CCL
Publisher: Chinese Information Processing Society of China
Pages: 1320–1333
Language: English
URL: https://aclanthology.org/2024.ccl-1.102/
Cite (ACL): Huang Qi, Fu Han, Luo Wenbin, Wang Mingwen, and Luo Kaiwei. 2024. PPDAC: A Plug-and-Play Data Augmentation Component for Few-shot Extractive Question Answering. In Proceedings of the 23rd Chinese National Conference on Computational Linguistics (Volume 1: Main Conference), pages 1320–1333, Taiyuan, China. Chinese Information Processing Society of China.
Cite (Informal): PPDAC: A Plug-and-Play Data Augmentation Component for Few-shot Extractive Question Answering (Qi et al., CCL 2024)
PDF: https://aclanthology.org/2024.ccl-1.102.pdf
