CausalQA: A Benchmark for Causal Question Answering
Alexander Bondarenko | Magdalena Wolska | Stefan Heindorf | Lukas Blübaum | Axel-Cyrille Ngonga Ngomo | Benno Stein | Pavel Braslavski | Matthias Hagen | Martin Potthast
Proceedings of the 29th International Conference on Computational Linguistics (COLING 2022)
At least 5% of questions submitted to search engines ask about cause-effect relationships in some way. To support the development of tailored approaches that can answer such questions, we construct Webis-CausalQA-22, a benchmark corpus of 1.1 million causal questions with answers. We distinguish different types of causal questions using a novel typology derived from a data-driven, manual analysis of questions from ten large question answering (QA) datasets. Using high-precision lexical rules, we extract causal questions of each type from these datasets to create our corpus. As an initial baseline, the state-of-the-art QA model UnifiedQA achieves a ROUGE-L F1 score of 0.48 on our new benchmark.
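For illustration, here is a minimal sketch of what a high-precision lexical rule for spotting causal questions might look like. The patterns below are simplified assumptions for demonstration only, not the actual rules used to construct Webis-CausalQA-22.

```python
import re

# Illustrative lexical patterns for causal questions (assumptions,
# not the paper's actual extraction rules).
CAUSAL_PATTERNS = [
    re.compile(r"^\s*why\b", re.IGNORECASE),                      # "Why does ...?"
    re.compile(r"\bwhat\s+(?:causes?|caused)\b", re.IGNORECASE),  # "What causes ...?"
    re.compile(r"\bwhat\s+happens\s+if\b", re.IGNORECASE),        # "What happens if ...?"
    re.compile(r"\b(?:effects?|consequences?)\s+of\b", re.IGNORECASE),
]

def is_causal_question(question: str) -> bool:
    """Return True if the question matches any causal pattern."""
    return any(p.search(question) for p in CAUSAL_PATTERNS)

# Example usage: filter a small batch of questions.
questions = [
    "Why does ice float on water?",
    "What causes inflation?",
    "Who wrote Hamlet?",
]
for q in questions:
    print(q, "->", is_causal_question(q))
```

Favoring a small set of unambiguous patterns like these trades recall for precision, which matches the paper's stated goal of extracting causal questions with high-precision rules.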