Conversational QA Dataset Generation with Answer Revision

Seonjeong Hwang, Gary Geunbae Lee


Abstract
Conversational question-answer generation is a task that automatically generates a large-scale conversational question answering dataset based on input passages. In this paper, we introduce a novel framework that extracts question-worthy phrases from a passage and then generates corresponding questions considering previous conversations. In particular, our framework revises the extracted answers after generating questions so that answers exactly match paired questions. Experimental results show that our simple answer revision approach leads to significant improvement in the quality of synthetic data. Moreover, we show that our framework can be effectively utilized for domain adaptation of conversational question answering.
Anthology ID:
2022.coling-1.140
Volume:
Proceedings of the 29th International Conference on Computational Linguistics
Month:
October
Year:
2022
Address:
Gyeongju, Republic of Korea
Editors:
Nicoletta Calzolari, Chu-Ren Huang, Hansaem Kim, James Pustejovsky, Leo Wanner, Key-Sun Choi, Pum-Mo Ryu, Hsin-Hsi Chen, Lucia Donatelli, Heng Ji, Sadao Kurohashi, Patrizia Paggio, Nianwen Xue, Seokhwan Kim, Younggyun Hahm, Zhong He, Tony Kyungil Lee, Enrico Santus, Francis Bond, Seung-Hoon Na
Venue:
COLING
Publisher:
International Committee on Computational Linguistics
Pages:
1636–1644
URL:
https://aclanthology.org/2022.coling-1.140
Cite (ACL):
Seonjeong Hwang and Gary Geunbae Lee. 2022. Conversational QA Dataset Generation with Answer Revision. In Proceedings of the 29th International Conference on Computational Linguistics, pages 1636–1644, Gyeongju, Republic of Korea. International Committee on Computational Linguistics.
Cite (Informal):
Conversational QA Dataset Generation with Answer Revision (Hwang & Lee, COLING 2022)
PDF:
https://aclanthology.org/2022.coling-1.140.pdf
Data:
CoQA, DoQA, QuAC