The Discussion Tracker Corpus of Collaborative Argumentation

Christopher Olshefski, Luca Lugini, Ravneet Singh, Diane Litman, Amanda Godley


Abstract
Although NLP research on argument mining has advanced considerably in recent years, most studies draw on corpora of asynchronous and written texts, often produced by individuals. Few published corpora of synchronous, multi-party argumentation are available. The Discussion Tracker corpus, collected in high school English classes, is an annotated dataset of transcripts of spoken, multi-party argumentation. The corpus consists of 29 multi-party discussions of English literature transcribed from 985 minutes of audio. The transcripts were annotated for three dimensions of collaborative argumentation: argument moves (claims, evidence, and explanations), specificity (low, medium, high), and collaboration (e.g., extensions of and disagreements about others’ ideas). In addition to providing descriptive statistics on the corpus, we provide performance benchmarks and associated code for predicting each dimension separately, illustrate the use of the multiple annotations in the corpus to improve performance via multi-task learning, and finally discuss other ways the corpus might be used to further NLP research.
Anthology ID:
2020.lrec-1.130
Volume:
Proceedings of the 12th Language Resources and Evaluation Conference
Month:
May
Year:
2020
Address:
Marseille, France
Venue:
LREC
Publisher:
European Language Resources Association
Pages:
1033–1043
Language:
English
URL:
https://aclanthology.org/2020.lrec-1.130
Cite (ACL):
Christopher Olshefski, Luca Lugini, Ravneet Singh, Diane Litman, and Amanda Godley. 2020. The Discussion Tracker Corpus of Collaborative Argumentation. In Proceedings of the 12th Language Resources and Evaluation Conference, pages 1033–1043, Marseille, France. European Language Resources Association.
Cite (Informal):
The Discussion Tracker Corpus of Collaborative Argumentation (Olshefski et al., LREC 2020)
PDF:
https://aclanthology.org/2020.lrec-1.130.pdf