You Make me Feel like a Natural Question: Training QA Systems on Transformed Trivia Questions

Tasnim Kabir, Yoo Yeon Sung, Saptarashmi Bandyopadhyay, Hao Zou, Abhranil Chandra, Jordan Boyd-Graber


Abstract
Training question-answering (QA) and information retrieval systems for web queries requires large, expensive datasets that are difficult to annotate and time-consuming to gather. Moreover, while natural datasets of information-seeking questions are often ambiguous or ill-formed, there are troves of freely available, carefully crafted question datasets for many languages. Thus, we automatically generate shorter, information-seeking questions that resemble web queries in the style of the Natural Questions (NQ) dataset from longer trivia data. Training a QA system on these transformed questions is a viable alternative to more expensive training setups: when the final systems are contrasted, the F1 score difference is less than six points.
Anthology ID:
2024.emnlp-main.1140
Volume:
Proceedings of the 2024 Conference on Empirical Methods in Natural Language Processing
Month:
November
Year:
2024
Address:
Miami, Florida, USA
Editors:
Yaser Al-Onaizan, Mohit Bansal, Yun-Nung Chen
Venue:
EMNLP
Publisher:
Association for Computational Linguistics
Pages:
20486–20510
URL:
https://aclanthology.org/2024.emnlp-main.1140
Cite (ACL):
Tasnim Kabir, Yoo Yeon Sung, Saptarashmi Bandyopadhyay, Hao Zou, Abhranil Chandra, and Jordan Boyd-Graber. 2024. You Make me Feel like a Natural Question: Training QA Systems on Transformed Trivia Questions. In Proceedings of the 2024 Conference on Empirical Methods in Natural Language Processing, pages 20486–20510, Miami, Florida, USA. Association for Computational Linguistics.
Cite (Informal):
You Make me Feel like a Natural Question: Training QA Systems on Transformed Trivia Questions (Kabir et al., EMNLP 2024)
PDF:
https://aclanthology.org/2024.emnlp-main.1140.pdf