Quality Estimation for Partially Subjective Classification Tasks via Crowdsourcing

Yoshinao Sato, Kouki Miyazawa


Abstract
Estimating the quality of artifacts that creators generate via crowdsourcing is important for constructing large-scale data resources. A common approach to this problem is to ask multiple reviewers to evaluate the same artifacts. However, the commonly used majority voting method for aggregating reviewers’ evaluations does not work effectively for partially subjective or purely subjective tasks because reviewers’ sensitivities and evaluation biases vary widely. To overcome this difficulty, we propose a probabilistic model for subjective classification tasks that incorporates the qualities of artifacts as well as the abilities and biases of creators and reviewers as latent variables to be jointly inferred. We applied this method to the partially subjective task of classifying speech into the following four attitudes: agreement, disagreement, stalling, and question. The results show that the proposed method estimates the quality of speech more effectively than vote aggregation, as measured by correlation with a fine-grained classification by experts.
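For orientation, the vote-aggregation baseline mentioned in the abstract can be sketched as follows. This is a minimal illustration, not the authors' implementation: the function names, the clip identifiers, and the review data are all hypothetical, and using the share of matching votes as a quality score is an assumption about how such a baseline might be computed.

```python
from collections import Counter


def majority_vote(labels):
    """Return the most frequent label; ties go to the label seen first."""
    return Counter(labels).most_common(1)[0][0]


def vote_share(labels, intended):
    """Fraction of reviewers whose label matches the creator's intended label."""
    return sum(label == intended for label in labels) / len(labels)


# Hypothetical data: clip id -> (intended attitude, labels from three reviewers).
reviews = {
    "clip_1": ("question", ["question", "question", "stalling"]),
    "clip_2": ("stalling", ["agreement", "stalling", "agreement"]),
}

# Every reviewer's vote counts equally; no per-reviewer ability or bias is modeled.
results = {
    clip: (majority_vote(labels), vote_share(labels, intended))
    for clip, (intended, labels) in reviews.items()
}

for clip, (winner, share) in results.items():
    print(clip, winner, round(share, 2))
```

Because each vote carries the same weight, a reviewer who systematically favors one attitude shifts the aggregate as much as a careful one. That is the limitation the abstract attributes to majority voting on subjective tasks.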
Anthology ID: 2020.lrec-1.29
Volume: Proceedings of the 12th Language Resources and Evaluation Conference
Month: May
Year: 2020
Address: Marseille, France
Venue: LREC
Publisher: European Language Resources Association
Pages: 229–235
Language: English
URL: https://aclanthology.org/2020.lrec-1.29
Cite (ACL): Yoshinao Sato and Kouki Miyazawa. 2020. Quality Estimation for Partially Subjective Classification Tasks via Crowdsourcing. In Proceedings of the 12th Language Resources and Evaluation Conference, pages 229–235, Marseille, France. European Language Resources Association.
Cite (Informal): Quality Estimation for Partially Subjective Classification Tasks via Crowdsourcing (Sato & Miyazawa, LREC 2020)
PDF: https://aclanthology.org/2020.lrec-1.29.pdf