@inproceedings{li-etal-2025-assessing,
title = "Assessing Crowdsourced Annotations with {LLM}s: Linguistic Certainty as a Proxy for Trustworthiness",
author = "Li, Tianyi and
Sree, Divya and
Ringenberg, Tatiana",
editor = {H{\"a}m{\"a}l{\"a}inen, Mika and
{\"O}hman, Emily and
Bizzoni, Yuri and
Miyagawa, So and
Alnajjar, Khalid},
booktitle = "Proceedings of the 5th International Conference on Natural Language Processing for Digital Humanities",
month = may,
year = "2025",
address = "Albuquerque, USA",
publisher = "Association for Computational Linguistics",
url = "https://aclanthology.org/2025.nlp4dh-1.16/",
doi = "10.18653/v1/2025.nlp4dh-1.16",
pages = "191--201",
ISBN = "979-8-89176-234-3",
abstract = "Human-annotated data is fundamental for training machine learning models, yet crowdsourced annotations often contain noise and bias. In this paper, we investigate the feasibility of employing large language models (LLMs), specifically GPT-4, as evaluators of crowdsourced annotations using a zero-shot prompting strategy. We introduce a certainty-based approach that leverages linguistic cues categorized into five levels (Absolute, High, Moderate, Low, Uncertain) based on Rubin{'}s framework{---}to assess the trustworthiness of LLM-generated evaluations. Using the MAVEN dataset as a case study, we compare GPT-4 evaluations against human evaluations and observe that the alignment between LLM and human judgments is strongly correlated with response certainty. Our results indicate that LLMs can effectively serve as a preliminary filter to flag potentially erroneous annotations for further expert review."
}<?xml version="1.0" encoding="UTF-8"?>
<modsCollection xmlns="http://www.loc.gov/mods/v3">
<mods ID="li-etal-2025-assessing">
<titleInfo>
<title>Assessing Crowdsourced Annotations with LLMs: Linguistic Certainty as a Proxy for Trustworthiness</title>
</titleInfo>
<name type="personal">
<namePart type="given">Tianyi</namePart>
<namePart type="family">Li</namePart>
<role>
<roleTerm authority="marcrelator" type="text">author</roleTerm>
</role>
</name>
<name type="personal">
<namePart type="given">Divya</namePart>
<namePart type="family">Sree</namePart>
<role>
<roleTerm authority="marcrelator" type="text">author</roleTerm>
</role>
</name>
<name type="personal">
<namePart type="given">Tatiana</namePart>
<namePart type="family">Ringenberg</namePart>
<role>
<roleTerm authority="marcrelator" type="text">author</roleTerm>
</role>
</name>
<originInfo>
<dateIssued>2025-05</dateIssued>
</originInfo>
<typeOfResource>text</typeOfResource>
<relatedItem type="host">
<titleInfo>
<title>Proceedings of the 5th International Conference on Natural Language Processing for Digital Humanities</title>
</titleInfo>
<name type="personal">
<namePart type="given">Mika</namePart>
<namePart type="family">Hämäläinen</namePart>
<role>
<roleTerm authority="marcrelator" type="text">editor</roleTerm>
</role>
</name>
<name type="personal">
<namePart type="given">Emily</namePart>
<namePart type="family">Öhman</namePart>
<role>
<roleTerm authority="marcrelator" type="text">editor</roleTerm>
</role>
</name>
<name type="personal">
<namePart type="given">Yuri</namePart>
<namePart type="family">Bizzoni</namePart>
<role>
<roleTerm authority="marcrelator" type="text">editor</roleTerm>
</role>
</name>
<name type="personal">
<namePart type="given">So</namePart>
<namePart type="family">Miyagawa</namePart>
<role>
<roleTerm authority="marcrelator" type="text">editor</roleTerm>
</role>
</name>
<name type="personal">
<namePart type="given">Khalid</namePart>
<namePart type="family">Alnajjar</namePart>
<role>
<roleTerm authority="marcrelator" type="text">editor</roleTerm>
</role>
</name>
<originInfo>
<publisher>Association for Computational Linguistics</publisher>
<place>
<placeTerm type="text">Albuquerque, USA</placeTerm>
</place>
</originInfo>
<genre authority="marcgt">conference publication</genre>
<identifier type="isbn">979-8-89176-234-3</identifier>
</relatedItem>
<abstract>Human-annotated data is fundamental for training machine learning models, yet crowdsourced annotations often contain noise and bias. In this paper, we investigate the feasibility of employing large language models (LLMs), specifically GPT-4, as evaluators of crowdsourced annotations using a zero-shot prompting strategy. We introduce a certainty-based approach that leverages linguistic cues categorized into five levels (Absolute, High, Moderate, Low, Uncertain) based on Rubin’s framework—to assess the trustworthiness of LLM-generated evaluations. Using the MAVEN dataset as a case study, we compare GPT-4 evaluations against human evaluations and observe that the alignment between LLM and human judgments is strongly correlated with response certainty. Our results indicate that LLMs can effectively serve as a preliminary filter to flag potentially erroneous annotations for further expert review.</abstract>
<identifier type="citekey">li-etal-2025-assessing</identifier>
<identifier type="doi">10.18653/v1/2025.nlp4dh-1.16</identifier>
<location>
<url>https://aclanthology.org/2025.nlp4dh-1.16/</url>
</location>
<part>
<date>2025-05</date>
<extent unit="page">
<start>191</start>
<end>201</end>
</extent>
</part>
</mods>
</modsCollection>
%0 Conference Proceedings
%T Assessing Crowdsourced Annotations with LLMs: Linguistic Certainty as a Proxy for Trustworthiness
%A Li, Tianyi
%A Sree, Divya
%A Ringenberg, Tatiana
%Y Hämäläinen, Mika
%Y Öhman, Emily
%Y Bizzoni, Yuri
%Y Miyagawa, So
%Y Alnajjar, Khalid
%S Proceedings of the 5th International Conference on Natural Language Processing for Digital Humanities
%D 2025
%8 May
%I Association for Computational Linguistics
%C Albuquerque, USA
%@ 979-8-89176-234-3
%F li-etal-2025-assessing
%X Human-annotated data is fundamental for training machine learning models, yet crowdsourced annotations often contain noise and bias. In this paper, we investigate the feasibility of employing large language models (LLMs), specifically GPT-4, as evaluators of crowdsourced annotations using a zero-shot prompting strategy. We introduce a certainty-based approach that leverages linguistic cues, categorized into five levels (Absolute, High, Moderate, Low, Uncertain) based on Rubin’s framework, to assess the trustworthiness of LLM-generated evaluations. Using the MAVEN dataset as a case study, we compare GPT-4 evaluations against human evaluations and observe that the alignment between LLM and human judgments is strongly correlated with response certainty. Our results indicate that LLMs can effectively serve as a preliminary filter to flag potentially erroneous annotations for further expert review.
%R 10.18653/v1/2025.nlp4dh-1.16
%U https://aclanthology.org/2025.nlp4dh-1.16/
%U https://doi.org/10.18653/v1/2025.nlp4dh-1.16
%P 191-201
Markdown (Informal)
[Assessing Crowdsourced Annotations with LLMs: Linguistic Certainty as a Proxy for Trustworthiness](https://aclanthology.org/2025.nlp4dh-1.16/) (Li et al., NLP4DH 2025)
ACL
Tianyi Li, Divya Sree, and Tatiana Ringenberg. 2025. Assessing Crowdsourced Annotations with LLMs: Linguistic Certainty as a Proxy for Trustworthiness. In Proceedings of the 5th International Conference on Natural Language Processing for Digital Humanities, pages 191–201, Albuquerque, USA. Association for Computational Linguistics.