Improving Low-resource Question Answering by Augmenting Question Information

Andong Chen, Yuan Sun, Xiaobing Zhao, Rosella Galindo Esparza, Kehai Chen, Yang Xiang, Tiejun Zhao, Min Zhang


Abstract
In the era of large models, low-resource question-answering tasks lag behind, which underscores the importance of data augmentation, a key research avenue in natural language processing. The main challenges are leveraging the large model's internal knowledge for data augmentation; determining which component of QA data (the question, passage, or answer) benefits most from augmentation; and keeping the augmented content consistent without introducing excessive noise. To tackle these, we introduce PQQ, an approach to question data augmentation consisting of three components: Prompt Answer, Question Generation, and Question Filter. Our experiments reveal that ChatGPT underperforms on the experimental data, while our PQQ method outperforms existing augmentation strategies. Further, its general applicability is validated through successful tests on high-resource QA tasks such as SQuAD1.1 and TriviaQA.
Anthology ID:
2023.findings-emnlp.699
Volume:
Findings of the Association for Computational Linguistics: EMNLP 2023
Month:
December
Year:
2023
Address:
Singapore
Editors:
Houda Bouamor, Juan Pino, Kalika Bali
Venue:
Findings
Publisher:
Association for Computational Linguistics
Pages:
10413–10420
URL:
https://aclanthology.org/2023.findings-emnlp.699
DOI:
10.18653/v1/2023.findings-emnlp.699
Cite (ACL):
Andong Chen, Yuan Sun, Xiaobing Zhao, Rosella Galindo Esparza, Kehai Chen, Yang Xiang, Tiejun Zhao, and Min Zhang. 2023. Improving Low-resource Question Answering by Augmenting Question Information. In Findings of the Association for Computational Linguistics: EMNLP 2023, pages 10413–10420, Singapore. Association for Computational Linguistics.
Cite (Informal):
Improving Low-resource Question Answering by Augmenting Question Information (Chen et al., Findings 2023)
PDF:
https://aclanthology.org/2023.findings-emnlp.699.pdf