Aligning NLP Models with Target Population Perspectives using PAIR: Population-Aligned Instance Replication

Stephanie Eckman; Bolei Ma; Christoph Kern; Rob Chew; Barbara Plank; Frauke Kreuter

doi:10.18653/v1/2025.nlperspectives-1.9

Abstract

Models trained on crowdsourced annotations may not reflect population views if the annotators do not represent the broader population. In this paper, we propose PAIR: Population-Aligned Instance Replication, a post-processing method that adjusts training data to better reflect target population characteristics without collecting additional annotations. Using simulation studies on offensive language and hate speech detection with varying annotator compositions, we show that non-representative pools degrade model calibration while leaving accuracy largely unchanged. PAIR corrects these calibration problems by replicating annotations from underrepresented annotator groups to match population proportions. We conclude with recommendations for improving the representativity of training data and model performance.
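The abstract does not spell out the replication procedure, so the following is only a minimal sketch of the general idea, not the authors' implementation. It assumes each annotation is a dict carrying a hypothetical `group` field, that target population shares are supplied as a dict covering every group present in the data, and that replication factors are rounded to whole numbers so that the most overrepresented group is kept exactly once. The function name `replicate_to_population` and all field names are illustrative assumptions.

```python
from collections import Counter


def replicate_to_population(annotations, target_shares, group_key="group"):
    """Replicate annotations so group shares approximate target_shares.

    annotations   : list of dicts, each with a group label under group_key
    target_shares : dict mapping group label -> desired share (sums to 1)
    Illustrative sketch only; rounding makes the match approximate.
    """
    n = len(annotations)
    counts = Counter(a[group_key] for a in annotations)

    # Ratio of target share to observed share: > 1 means underrepresented.
    ratios = {g: target_shares[g] / (counts[g] / n) for g in counts}

    # Normalise so the most overrepresented group gets factor 1,
    # which means no annotation is ever dropped.
    base = min(ratios.values())

    replicated = []
    for a in annotations:
        factor = max(1, round(ratios[a[group_key]] / base))
        # Copy each dict so replicated rows are independent objects.
        replicated.extend(dict(a) for _ in range(factor))
    return replicated


# Toy pool (made-up data): 8 annotations from group "A", 2 from group "B",
# with an assumed 50/50 target population.
pool = [{"group": "A", "label": 0} for _ in range(8)] + [
    {"group": "B", "label": 1} for _ in range(2)
]
balanced = replicate_to_population(pool, {"A": 0.5, "B": 0.5})
group_counts = Counter(a["group"] for a in balanced)
```

In this toy case each group "B" annotation is replicated four times while group "A" annotations are kept once, so both groups contribute equally to the adjusted training set. Replication is equivalent to integer weighting of instances; a training pipeline that accepts per-instance weights could apply the same ratios without duplicating rows.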

Anthology ID: 2025.nlperspectives-1.9
Volume: Proceedings of the The 4th Workshop on Perspectivist Approaches to NLP
Month: November
Year: 2025
Address: Suzhou, China
Editors: Gavin Abercrombie, Valerio Basile, Simona Frenda, Sara Tonelli, Shiran Dudy
Venues: NLPerspectives | WS
Publisher: Association for Computational Linguistics
Pages: 100–110
URL: https://aclanthology.org/2025.nlperspectives-1.9/
DOI: 10.18653/v1/2025.nlperspectives-1.9
Cite (ACL): Stephanie Eckman, Bolei Ma, Christoph Kern, Rob Chew, Barbara Plank, and Frauke Kreuter. 2025. Aligning NLP Models with Target Population Perspectives using PAIR: Population-Aligned Instance Replication. In Proceedings of the The 4th Workshop on Perspectivist Approaches to NLP, pages 100–110, Suzhou, China. Association for Computational Linguistics.
Cite (Informal): Aligning NLP Models with Target Population Perspectives using PAIR: Population-Aligned Instance Replication (Eckman et al., NLPerspectives 2025)
PDF: https://aclanthology.org/2025.nlperspectives-1.9.pdf
