Finding Good Conversations Online: The Yahoo News Annotated Comments Corpus

Courtney Napoles, Joel Tetreault, Aasish Pappu, Enrica Rosato, Brian Provenzale


Abstract
This work presents a dataset and annotation scheme for the new task of identifying “good” conversations that occur online, which we call ERICs: Engaging, Respectful, and/or Informative Conversations. We develop a taxonomy to reflect features of entire threads and individual comments which we believe contribute to identifying ERICs; code a novel dataset of Yahoo News comment threads (2.4k threads and 10k comments) and 1k threads from the Internet Argument Corpus; and analyze the features characteristic of ERICs. This is one of the largest annotated corpora of online human dialogues, with the most detailed set of annotations. It will be valuable for identifying ERICs and other aspects of argumentation, dialogue, and discourse.
Anthology ID:
W17-0802
Volume:
Proceedings of the 11th Linguistic Annotation Workshop
Month:
April
Year:
2017
Address:
Valencia, Spain
Venues:
LAW | WS
SIG:
SIGANN
Publisher:
Association for Computational Linguistics
Pages:
13–23
URL:
https://aclanthology.org/W17-0802
DOI:
10.18653/v1/W17-0802
Cite (ACL):
Courtney Napoles, Joel Tetreault, Aasish Pappu, Enrica Rosato, and Brian Provenzale. 2017. Finding Good Conversations Online: The Yahoo News Annotated Comments Corpus. In Proceedings of the 11th Linguistic Annotation Workshop, pages 13–23, Valencia, Spain. Association for Computational Linguistics.
Cite (Informal):
Finding Good Conversations Online: The Yahoo News Annotated Comments Corpus (Napoles et al., 2017)
PDF:
https://aclanthology.org/W17-0802.pdf
Code:
cnap/ynacc