Zhen Nie
2020
Easy, Reproducible and Quality-Controlled Data Collection with CROWDAQ
Qiang Ning | Hao Wu | Pradeep Dasigi | Dheeru Dua | Matt Gardner | Robert L. Logan IV | Ana Marasović | Zhen Nie
Proceedings of the 2020 Conference on Empirical Methods in Natural Language Processing: System Demonstrations
High-quality and large-scale data are key to success for AI systems. However, large-scale data annotation efforts are often confronted with a set of common challenges: (1) designing a user-friendly annotation interface; (2) training enough annotators efficiently; and (3) reproducibility. To address these problems, we introduce CROWDAQ, an open-source platform that standardizes the data collection pipeline with customizable user-interface components, automated annotator qualification, and saved pipelines in a re-usable format. We show that CROWDAQ simplifies data annotation significantly on a diverse set of data collection use cases and we hope it will be a convenient tool for the community.