A New Dataset for Causality Identification in Argumentative Texts

Khalid Al Khatib, Michael Voelske, Anh Le, Shahbaz Syed, Martin Potthast, Benno Stein


Abstract
Existing datasets for causality identification in argumentative texts are limited in several ways: in the type of input text (e.g., only claims), in the causality type (e.g., only positive), and in the linguistic patterns investigated (e.g., only verb connectives). To address these limitations, we build the Webis-Causality-23 dataset, which covers more sophisticated inputs (all units from arguments), a balanced distribution of causality types, and a larger number of linguistic patterns denoting causality. The dataset contains 1485 examples, derived by combining the two paradigms of distant supervision and uncertainty sampling in order to identify diverse, high-quality samples of causality relations and to annotate them cost-effectively.
Anthology ID:
2023.sigdial-1.31
Volume:
Proceedings of the 24th Annual Meeting of the Special Interest Group on Discourse and Dialogue
Month:
September
Year:
2023
Address:
Prague, Czechia
Editors:
Svetlana Stoyanchev, Shafiq Joty, David Schlangen, Ondrej Dusek, Casey Kennington, Malihe Alikhani
Venue:
SIGDIAL
SIG:
SIGDIAL
Publisher:
Association for Computational Linguistics
Pages:
349–354
URL:
https://aclanthology.org/2023.sigdial-1.31
DOI:
10.18653/v1/2023.sigdial-1.31
Cite (ACL):
Khalid Al Khatib, Michael Voelske, Anh Le, Shahbaz Syed, Martin Potthast, and Benno Stein. 2023. A New Dataset for Causality Identification in Argumentative Texts. In Proceedings of the 24th Annual Meeting of the Special Interest Group on Discourse and Dialogue, pages 349–354, Prague, Czechia. Association for Computational Linguistics.
Cite (Informal):
A New Dataset for Causality Identification in Argumentative Texts (Al Khatib et al., SIGDIAL 2023)
PDF:
https://aclanthology.org/2023.sigdial-1.31.pdf