Aggregating and Learning from Multiple Annotators

Silviu Paun, Edwin Simpson


Abstract
The success of NLP research is founded on high-quality annotated datasets, which are usually obtained from multiple expert annotators or crowd workers. The standard practice when training machine learning models is to first adjudicate the disagreements and then perform the training. To this end, there has been a great deal of work on aggregating annotations, particularly for classification tasks. However, many other tasks, particularly in NLP, have unique characteristics not considered by standard models of annotation, e.g., label interdependencies in sequence labelling tasks, unrestricted labels for anaphoric annotation, or preference labels for ranking texts. In recent years, researchers have picked up on this and begun to close the gap. A first objective of this tutorial is to connect NLP researchers with state-of-the-art aggregation models for a diverse set of canonical language annotation tasks. There is also a growing body of recent work arguing that the convention of training with adjudicated labels discards any uncertainty the labellers had in their classifications, resulting in models with poorer generalisation capabilities. A second objective of this tutorial is therefore to teach NLP practitioners how to augment their (deep) neural models to learn from data with multiple interpretations.
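
To make the two objectives concrete, the following is a minimal sketch (not taken from the tutorial itself) of the classic Dawid-Skene aggregation model for classification, fit with EM. Its per-item class posteriors can either be adjudicated into hard labels, as in the standard practice the abstract describes, or kept as soft training targets that preserve the annotators' uncertainty. The function name, variable names, and toy data below are illustrative assumptions.

    # A sketch of Dawid-Skene aggregation; names and data are illustrative.
    import numpy as np

    def dawid_skene(votes, n_classes, n_iter=50):
        """votes: int array, shape (n_items, n_annotators); -1 marks a missing vote."""
        n_items, n_annotators = votes.shape

        # Initialise item-class posteriors T with normalised vote counts (majority-vote style).
        T = np.zeros((n_items, n_classes))
        for i in range(n_items):
            for a in range(n_annotators):
                if votes[i, a] >= 0:
                    T[i, votes[i, a]] += 1
        T /= T.sum(axis=1, keepdims=True)

        for _ in range(n_iter):
            # M-step: re-estimate class priors and per-annotator confusion matrices.
            prior = T.mean(axis=0)
            conf = np.full((n_annotators, n_classes, n_classes), 1e-6)  # smoothed counts
            for i in range(n_items):
                for a in range(n_annotators):
                    if votes[i, a] >= 0:
                        conf[a, :, votes[i, a]] += T[i]
            conf /= conf.sum(axis=2, keepdims=True)

            # E-step: recompute the class posteriors given the current parameters.
            log_T = np.tile(np.log(prior), (n_items, 1))
            for i in range(n_items):
                for a in range(n_annotators):
                    if votes[i, a] >= 0:
                        log_T[i] += np.log(conf[a, :, votes[i, a]])
            log_T -= log_T.max(axis=1, keepdims=True)  # numerical stability
            T = np.exp(log_T)
            T /= T.sum(axis=1, keepdims=True)
        return T

    # Toy data (an illustrative assumption): 4 items, 3 annotators, 2 classes.
    votes = np.array([[0, 0, 1],
                      [1, 1, 0],
                      [0, 0, 0],
                      [1, 1, 1]])
    posteriors = dawid_skene(votes, n_classes=2)

    # Adjudicating to hard labels discards the annotators' uncertainty ...
    hard_labels = posteriors.argmax(axis=1)
    # ... whereas the posteriors can instead serve as soft targets, e.g. via
    # loss_i = -sum_c posteriors[i, c] * log p_model(c | x_i).
    print(hard_labels)
    print(posteriors.round(2))

Replacing the final argmax with a soft cross-entropy against the posteriors is one simple way to let a neural model learn from the multiple interpretations the abstract refers to.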
Anthology ID:
2021.eacl-tutorials.2
Volume:
Proceedings of the 16th Conference of the European Chapter of the Association for Computational Linguistics: Tutorial Abstracts
Month:
April
Year:
2021
Address:
online
Editors:
Isabelle Augenstein, Ivan Habernal
Venue:
EACL
Publisher:
Association for Computational Linguistics
Pages:
6–9
URL:
https://aclanthology.org/2021.eacl-tutorials.2
DOI:
10.18653/v1/2021.eacl-tutorials.2
Cite (ACL):
Silviu Paun and Edwin Simpson. 2021. Aggregating and Learning from Multiple Annotators. In Proceedings of the 16th Conference of the European Chapter of the Association for Computational Linguistics: Tutorial Abstracts, pages 6–9, online. Association for Computational Linguistics.
Cite (Informal):
Aggregating and Learning from Multiple Annotators (Paun & Simpson, EACL 2021)
PDF:
https://aclanthology.org/2021.eacl-tutorials.2.pdf