Annotator Response Distributions as a Sampling Frame

Christopher Homan, Tharindu Cyril Weerasooriya, Lora Aroyo, Chris Welty


Abstract
Annotator disagreement is often dismissed as noise or as the result of a poor-quality annotation process. Others have argued that it can be meaningful. But without a rigorous statistical foundation, the analysis of disagreement patterns can resemble a high-tech form of tea-leaf reading. We contribute a framework for analyzing the variation of per-item annotator response distributions in data collected for human-in-the-loop machine learning. We provide visualizations for the framework and use it to analyze the variance in a crowdsourced dataset of hard-to-classify examples from the OpenImages archive.
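
The framework itself is developed in the paper's full text; purely to illustrate the central object named in the abstract, a per-item annotator response distribution, here is a minimal Python sketch. It assumes annotations arrive as (item_id, label) pairs; the function names and the use of Shannon entropy as a simple per-item disagreement measure are illustrative choices, not the paper's method.

    from collections import Counter, defaultdict
    from math import log2

    def response_distributions(annotations):
        """Group raw (item_id, label) pairs into per-item empirical
        label distributions, e.g. {"img1": {"cat": 0.67, "dog": 0.33}}."""
        counts = defaultdict(Counter)
        for item_id, label in annotations:
            counts[item_id][label] += 1
        return {
            item: {lab: n / sum(c.values()) for lab, n in c.items()}
            for item, c in counts.items()
        }

    def entropy(dist):
        """Shannon entropy (bits) of one item's response distribution:
        0 for unanimous annotators, larger values for more disagreement."""
        return sum(-p * log2(p) for p in dist.values() if p > 0)

    # Toy example: three annotators split on img1, two agree on img2.
    raw = [("img1", "cat"), ("img1", "cat"), ("img1", "dog"),
           ("img2", "cat"), ("img2", "cat")]
    for item, dist in sorted(response_distributions(raw).items()):
        print(item, dist, f"entropy = {entropy(dist):.2f} bits")

On the toy data this prints a split item (img1) with nonzero entropy and a unanimous item (img2) with zero entropy, the kind of per-item variation the paper's framework is designed to analyze rigorously.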
Anthology ID: 2022.nlperspectives-1.8
Volume: Proceedings of the 1st Workshop on Perspectivist Approaches to NLP @LREC2022
Month: June
Year: 2022
Address: Marseille, France
Editors: Gavin Abercrombie, Valerio Basile, Sara Tonelli, Verena Rieser, Alexandra Uma
Venue: NLPerspectives
Publisher: European Language Resources Association
Pages: 56–65
URL: https://aclanthology.org/2022.nlperspectives-1.8
Cite (ACL): Christopher Homan, Tharindu Cyril Weerasooriya, Lora Aroyo, and Chris Welty. 2022. Annotator Response Distributions as a Sampling Frame. In Proceedings of the 1st Workshop on Perspectivist Approaches to NLP @LREC2022, pages 56–65, Marseille, France. European Language Resources Association.
Cite (Informal): Annotator Response Distributions as a Sampling Frame (Homan et al., NLPerspectives 2022)
PDF: https://aclanthology.org/2022.nlperspectives-1.8.pdf