Self-Training for Jointly Learning to Ask and Answer Questions

Mrinmaya Sachan, Eric Xing


Abstract
Building curious machines that can answer as well as ask questions is an important challenge for AI. The two tasks of question answering and question generation are usually tackled separately in the NLP literature. At the same time, both require significant amounts of supervised data, which is hard to obtain in many domains. To alleviate these issues, we propose a self-training method for jointly learning to ask as well as answer questions, leveraging unlabeled text along with labeled question-answer pairs for learning. We evaluate our approach on four benchmark datasets: SQuAD, MS MARCO, WikiQA and TrecQA, and show significant improvements over a number of established baselines on both question answering and question generation tasks. We also achieve new state-of-the-art results on two competitive answer sentence selection tasks: WikiQA and TrecQA.
Anthology ID:
N18-1058
Volume:
Proceedings of the 2018 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies, Volume 1 (Long Papers)
Month:
June
Year:
2018
Address:
New Orleans, Louisiana
Editors:
Marilyn Walker, Heng Ji, Amanda Stent
Venue:
NAACL
Publisher:
Association for Computational Linguistics
Pages:
629–640
URL:
https://aclanthology.org/N18-1058
DOI:
10.18653/v1/N18-1058
Cite (ACL):
Mrinmaya Sachan and Eric Xing. 2018. Self-Training for Jointly Learning to Ask and Answer Questions. In Proceedings of the 2018 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies, Volume 1 (Long Papers), pages 629–640, New Orleans, Louisiana. Association for Computational Linguistics.
Cite (Informal):
Self-Training for Jointly Learning to Ask and Answer Questions (Sachan & Xing, NAACL 2018)
PDF:
https://aclanthology.org/N18-1058.pdf
Data
MS MARCO
WikiQA