Corpus Annotation through Crowdsourcing: Towards Best Practice Guidelines

Marta Sabou; Kalina Bontcheva; Leon Derczynski; Arno Scharl

Abstract

Crowdsourcing is an emerging collaborative approach that can be used for the acquisition of annotated corpora and a wide range of other linguistic resources. Although the use of this approach is intensifying in all its key genres (paid-for crowdsourcing, games with a purpose, volunteering-based approaches), the community still lacks a set of best-practice guidelines similar to the annotation best practices for traditional, expert-based corpus acquisition. In this paper we focus on the use of crowdsourcing methods for corpus acquisition and propose a set of best-practice guidelines based on our own experiences in this area and an overview of related literature. We also introduce GATE Crowd, a plugin for the GATE platform that relies on these guidelines and offers tool support for using crowdsourcing in a more principled and efficient manner.

Anthology ID: L14-1412
Volume: Proceedings of the Ninth International Conference on Language Resources and Evaluation (LREC'14)
Month: May
Year: 2014
Address: Reykjavik, Iceland
Editors: Nicoletta Calzolari, Khalid Choukri, Thierry Declerck, Hrafn Loftsson, Bente Maegaard, Joseph Mariani, Asuncion Moreno, Jan Odijk, Stelios Piperidis
Venue: LREC
Publisher: European Language Resources Association (ELRA)
Pages: 859–866
URL: http://www.lrec-conf.org/proceedings/lrec2014/pdf/497_Paper.pdf
Cite (ACL): Marta Sabou, Kalina Bontcheva, Leon Derczynski, and Arno Scharl. 2014. Corpus Annotation through Crowdsourcing: Towards Best Practice Guidelines. In Proceedings of the Ninth International Conference on Language Resources and Evaluation (LREC'14), pages 859–866, Reykjavik, Iceland. European Language Resources Association (ELRA).
Cite (Informal): Corpus Annotation through Crowdsourcing: Towards Best Practice Guidelines (Sabou et al., LREC 2014)
PDF: http://www.lrec-conf.org/proceedings/lrec2014/pdf/497_Paper.pdf
Video: https://www.youtube.com/watch?v=cMl-z0-p0wI
