FriendsQA: Open-Domain Question Answering on TV Show Transcripts

Zhengzhe Yang, Jinho D. Choi


Abstract
This paper presents FriendsQA, a challenging question answering dataset that contains 1,222 dialogues and 10,610 open-domain questions, to tackle machine comprehension on everyday conversations. Each dialogue, involving multiple speakers, is annotated with several types of questions regarding the dialogue contexts, and the answers are annotated with certain spans in the dialogue. A series of crowdsourcing tasks are conducted to ensure good annotation quality, resulting a high inter-annotator agreement of 81.82%. A comprehensive annotation analytics is provided for a deeper understanding in this dataset. Three state-of-the-art QA systems are experimented, R-Net, QANet, and BERT, and evaluated on this dataset. BERT in particular depicts promising results, an accuracy of 74.2% for answer utterance selection and an F1-score of 64.2% for answer span selection, suggesting that the FriendsQA task is hard yet has a great potential of elevating QA research on multiparty dialogue to another level.
Anthology ID:
W19-5923
Volume:
Proceedings of the 20th Annual SIGdial Meeting on Discourse and Dialogue
Month:
September
Year:
2019
Address:
Stockholm, Sweden
Venues:
SIGDIAL | WS
SIG:
SIGDIAL
Publisher:
Association for Computational Linguistics
Note:
Pages:
188–197
Language:
URL:
https://aclanthology.org/W19-5923
DOI:
10.18653/v1/W19-5923
Bibkey:
Copy Citation:
PDF:
https://aclanthology.org/W19-5923.pdf