Crowdsourcing Learning as Domain Adaptation: A Case Study on Named Entity Recognition

Xin Zhang, Guangwei Xu, Yueheng Sun, Meishan Zhang, Pengjun Xie


Abstract
Crowdsourcing is regarded as a promising solution for effective supervised learning, aiming to build large-scale annotated training data with crowd workers. Previous studies focus on reducing the influence of noise in crowdsourced annotations on supervised models. We take a different view in this work, regarding all crowdsourced annotations as gold-standard with respect to the individual annotators. In this way, we find that crowdsourcing is highly similar to domain adaptation, so recent advances in cross-domain methods can be applied to crowdsourcing almost directly. Here we take named entity recognition (NER) as a case study, proposing an annotator-aware representation learning model inspired by domain adaptation methods that attempt to capture effective domain-aware features. We investigate both unsupervised and supervised crowdsourcing learning, assuming that no or only small-scale expert annotations are available, respectively. Experimental results on a benchmark crowdsourced NER dataset show that our method is highly effective, leading to a new state-of-the-art performance. In addition, under the supervised setting, we can achieve impressive performance gains with only a very small scale of expert annotations.
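The framing in the abstract, in which each annotator is treated like a domain and its labels are kept as gold for that annotator, can be illustrated with a small data-organisation sketch. This is not the authors' implementation: the `Annotation` class, the `group_by_domain` and `learning_setting` helpers, the `"expert"` identifier, and the toy sentences are all hypothetical names chosen for illustration.

```python
from collections import defaultdict
from dataclasses import dataclass
from typing import Dict, List, Tuple


@dataclass(frozen=True)
class Annotation:
    """One labelled sentence from one annotator (hypothetical structure)."""
    tokens: Tuple[str, ...]
    tags: Tuple[str, ...]   # BIO-style NER tags, one per token
    annotator: str          # assumed id; "expert" marks expert labels


def group_by_domain(annotations: List[Annotation]) -> Dict[str, List[Annotation]]:
    """Bucket examples by annotator, treating each annotator as one 'domain'.

    Conflicting labels for the same sentence are all kept: each is regarded
    as gold with respect to the annotator who produced it, so no voting or
    noise filtering happens here.
    """
    domains: Dict[str, List[Annotation]] = defaultdict(list)
    for ann in annotations:
        domains[ann.annotator].append(ann)
    return dict(domains)


def learning_setting(domains: Dict[str, List[Annotation]],
                     expert_id: str = "expert") -> str:
    """Name the setting: supervised if any expert annotations are present."""
    return "supervised" if domains.get(expert_id) else "unsupervised"


sentence = ("Paris", "is", "nice")
crowd = [
    Annotation(sentence, ("B-LOC", "O", "O"), "worker_1"),
    Annotation(sentence, ("B-PER", "O", "O"), "worker_2"),  # disagrees, still kept
]
crowd_domains = group_by_domain(crowd)

# Adding a small amount of expert data switches to the supervised setting.
with_expert = crowd + [Annotation(sentence, ("B-LOC", "O", "O"), "expert")]
expert_domains = group_by_domain(with_expert)
```

In this sketch, `learning_setting(crowd_domains)` names the unsupervised setting and `learning_setting(expert_domains)` the supervised one. An annotator-aware model as described in the abstract would then condition its representations on the annotator key, much as a domain-aware model conditions on a domain label.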
Anthology ID:
2021.acl-long.432
Volume:
Proceedings of the 59th Annual Meeting of the Association for Computational Linguistics and the 11th International Joint Conference on Natural Language Processing (Volume 1: Long Papers)
Month:
August
Year:
2021
Address:
Online
Editors:
Chengqing Zong, Fei Xia, Wenjie Li, Roberto Navigli
Venues:
ACL | IJCNLP
Publisher:
Association for Computational Linguistics
Pages:
5558–5570
URL:
https://aclanthology.org/2021.acl-long.432
DOI:
10.18653/v1/2021.acl-long.432
Cite (ACL):
Xin Zhang, Guangwei Xu, Yueheng Sun, Meishan Zhang, and Pengjun Xie. 2021. Crowdsourcing Learning as Domain Adaptation: A Case Study on Named Entity Recognition. In Proceedings of the 59th Annual Meeting of the Association for Computational Linguistics and the 11th International Joint Conference on Natural Language Processing (Volume 1: Long Papers), pages 5558–5570, Online. Association for Computational Linguistics.
Cite (Informal):
Crowdsourcing Learning as Domain Adaptation: A Case Study on Named Entity Recognition (Zhang et al., ACL-IJCNLP 2021)
PDF:
https://aclanthology.org/2021.acl-long.432.pdf
Video:
https://aclanthology.org/2021.acl-long.432.mp4
Code:
izhx/CLasDA
Data:
CoNLL 2003