Quality Control for Crowdsourced Bilingual Dictionary in Low-Resource Languages

Hiroki Chida, Yohei Murakami, Mondheera Pituxcoosuvarn


Abstract
Conventional crowdsourced bilingual dictionary creation mainly asks multiple workers to translate the same words or sentences and takes a majority vote over their answers. However, when this method is applied to low-resource languages with few speakers, many low-quality workers are expected to participate in the voting, which makes it difficult to maintain the quality of the majority-vote evaluation. Therefore, we apply an effective aggregation method for quality control that uses hyper questions, where a hyper question is a set of single questions. Furthermore, to select high-quality workers, we design a task-allocation method based on worker reliability, which is estimated from each worker's work results.
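The contrast between per-question majority voting and aggregation over hyper questions can be illustrated with a minimal sketch. This is one plausible reading of the idea, not the authors' implementation: each group of k single questions is treated as one hyper question, a worker's answer to it is the tuple of their single answers, the majority vote is taken over tuples, and the winning tuples are then decoded back to single questions by a second vote. The function names, the parameter `k`, and the toy responses are all hypothetical.

```python
from collections import Counter
from itertools import combinations


def majority_vote(answers):
    """Return the most frequent answer (ties go to the first one seen)."""
    return Counter(answers).most_common(1)[0][0]


def plain_vote(responses):
    """Majority vote taken independently on every single question."""
    questions = sorted(next(iter(responses.values())))
    return {q: majority_vote([r[q] for r in responses.values()])
            for q in questions}


def hyper_question_vote(responses, k=2):
    """Majority vote over k-question tuples, decoded back per question.

    Assumes every worker answered every question (a simplification).
    """
    questions = sorted(next(iter(responses.values())))
    votes = {q: [] for q in questions}
    for group in combinations(questions, k):
        # A worker's answer to the hyper question is the tuple of answers.
        tuples = [tuple(r[q] for q in group) for r in responses.values()]
        winner = majority_vote(tuples)
        for q, answer in zip(group, winner):
            votes[q].append(answer)
    return {q: majority_vote(v) for q, v in votes.items()}


# Toy data (hypothetical): two reliable workers agree on "a", "b", "c";
# three unreliable workers share one wrong answer on q1 but otherwise
# make different mistakes.
responses = {
    "expert_1": {"q1": "a", "q2": "b", "q3": "c"},
    "expert_2": {"q1": "a", "q2": "b", "q3": "c"},
    "novice_1": {"q1": "x", "q2": "b", "q3": "z"},
    "novice_2": {"q1": "x", "q2": "y", "q3": "c"},
    "novice_3": {"q1": "x", "q2": "w", "q3": "v"},
}

plain = plain_vote(responses)
hyper = hyper_question_vote(responses, k=2)
print(plain)
print(hyper)
```

In this toy case the unreliable workers outvote the reliable ones on q1 under plain majority voting, but they rarely agree on a whole tuple of answers, so the reliable workers' consistent tuples win once questions are grouped.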
Anthology ID:
2022.lrec-1.709
Volume:
Proceedings of the Thirteenth Language Resources and Evaluation Conference
Month:
June
Year:
2022
Address:
Marseille, France
Editors:
Nicoletta Calzolari, Frédéric Béchet, Philippe Blache, Khalid Choukri, Christopher Cieri, Thierry Declerck, Sara Goggi, Hitoshi Isahara, Bente Maegaard, Joseph Mariani, Hélène Mazo, Jan Odijk, Stelios Piperidis
Venue:
LREC
Publisher:
European Language Resources Association
Pages:
6590–6596
URL:
https://aclanthology.org/2022.lrec-1.709
Cite (ACL):
Hiroki Chida, Yohei Murakami, and Mondheera Pituxcoosuvarn. 2022. Quality Control for Crowdsourced Bilingual Dictionary in Low-Resource Languages. In Proceedings of the Thirteenth Language Resources and Evaluation Conference, pages 6590–6596, Marseille, France. European Language Resources Association.
Cite (Informal):
Quality Control for Crowdsourced Bilingual Dictionary in Low-Resource Languages (Chida et al., LREC 2022)
PDF:
https://aclanthology.org/2022.lrec-1.709.pdf