How Do We Answer Complex Questions: Discourse Structure of Long-form Answers

Fangyuan Xu, Junyi Jessy Li, Eunsol Choi


Abstract
Long-form answers, consisting of multiple sentences, can provide nuanced and comprehensive answers to a broader set of questions. To better understand this complex and understudied task, we study the functional structure of long-form answers collected from three datasets, ELI5, WebGPT and Natural Questions. Our main goal is to understand how humans organize information to craft complex answers. We develop an ontology of six sentence-level functional roles for long-form answers, and annotate 3.9k sentences in 640 answer paragraphs. Different answer collection methods manifest in different discourse structures. We further analyze model-generated answers – finding that annotators agree less with each other when annotating model-generated answers compared to annotating human-written answers. Our annotated data enables training a strong classifier that can be used for automatic analysis. We hope our work can inspire future research on discourse-level modeling and evaluation of long-form QA systems.
Anthology ID:
2022.acl-long.249
Volume:
Proceedings of the 60th Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers)
Month:
May
Year:
2022
Address:
Dublin, Ireland
Venue:
ACL
Publisher:
Association for Computational Linguistics
Pages:
3556–3572
URL:
https://aclanthology.org/2022.acl-long.249
DOI:
10.18653/v1/2022.acl-long.249
Cite (ACL):
Fangyuan Xu, Junyi Jessy Li, and Eunsol Choi. 2022. How Do We Answer Complex Questions: Discourse Structure of Long-form Answers. In Proceedings of the 60th Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers), pages 3556–3572, Dublin, Ireland. Association for Computational Linguistics.
Cite (Informal):
How Do We Answer Complex Questions: Discourse Structure of Long-form Answers (Xu et al., ACL 2022)
PDF:
https://aclanthology.org/2022.acl-long.249.pdf
Code:
utcsnlp/lfqa_discourse
Data:
ELI5, Natural Questions