Interrater Disagreement Resolution: A Systematic Procedure to Reach Consensus in Annotation Tasks

Yvette Oortwijn, Thijs Ossenkoppele, Arianna Betti


Abstract
We present a systematic procedure for interrater disagreement resolution. The procedure is general, but of particular use in multiple-annotator tasks geared towards ground truth construction. We motivate our proposal by arguing that, barring cases in which the researchers’ goal is to elicit different viewpoints, interrater disagreement is a sign of poor quality in the design or the description of a task. Consensus among annotators, we maintain, should be striven for, through a systematic procedure for disagreement resolution such as the one we describe.
Anthology ID: 2021.humeval-1.15
Volume: Proceedings of the Workshop on Human Evaluation of NLP Systems (HumEval)
Month: April
Year: 2021
Address: Online
Editors: Anya Belz, Shubham Agarwal, Yvette Graham, Ehud Reiter, Anastasia Shimorina
Venue: HumEval
Publisher: Association for Computational Linguistics
Pages: 131–141
URL: https://aclanthology.org/2021.humeval-1.15
Cite (ACL): Yvette Oortwijn, Thijs Ossenkoppele, and Arianna Betti. 2021. Interrater Disagreement Resolution: A Systematic Procedure to Reach Consensus in Annotation Tasks. In Proceedings of the Workshop on Human Evaluation of NLP Systems (HumEval), pages 131–141, Online. Association for Computational Linguistics.
Cite (Informal): Interrater Disagreement Resolution: A Systematic Procedure to Reach Consensus in Annotation Tasks (Oortwijn et al., HumEval 2021)
PDF: https://aclanthology.org/2021.humeval-1.15.pdf
Video: https://www.youtube.com/watch?v=z-O6zZJDxOY
Video: https://aclanthology.org/2021.humeval-1.15.mp4
Code: yoortwijn/humevaldisres