ArgAnalysis35K : A large-scale dataset for Argument Quality Analysis

Omkar Joshi, Priya Pitre, Yashodhara Haribhakta


Abstract
Argument Quality Detection is an emerging field in NLP which has seen significant recent development. However, existing datasets in this field suffer from a lack of quality, quantity and diversity of topics and arguments, specifically the presence of vague arguments that are not persuasive in nature. In this paper, we leverage a combined experience of 10+ years of Parliamentary Debating to create a dataset that covers significantly more topics and has a wide range of sources to capture more diversity of opinion. With 34,890 high-quality argument-analysis pairs (a term we introduce in this paper), this is also the largest dataset of its kind to our knowledge. In addition to this contribution, we introduce an innovative argument scoring system based on instance-level annotator reliability and propose a quantitative model of scoring the relevance of arguments to a range of topics.
Anthology ID:
2023.acl-long.778
Volume:
Proceedings of the 61st Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers)
Month:
July
Year:
2023
Address:
Toronto, Canada
Editors:
Anna Rogers, Jordan Boyd-Graber, Naoaki Okazaki
Venue:
ACL
Publisher:
Association for Computational Linguistics
Pages:
13916–13931
URL:
https://aclanthology.org/2023.acl-long.778
DOI:
10.18653/v1/2023.acl-long.778
Cite (ACL):
Omkar Joshi, Priya Pitre, and Yashodhara Haribhakta. 2023. ArgAnalysis35K : A large-scale dataset for Argument Quality Analysis. In Proceedings of the 61st Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers), pages 13916–13931, Toronto, Canada. Association for Computational Linguistics.
Cite (Informal):
ArgAnalysis35K : A large-scale dataset for Argument Quality Analysis (Joshi et al., ACL 2023)
PDF:
https://aclanthology.org/2023.acl-long.778.pdf
Video:
https://aclanthology.org/2023.acl-long.778.mp4