@inproceedings{park-etal-2025-llm,
title = "{LLM}-{C}3{MOD}: A Human-{LLM} Collaborative System for Cross-Cultural Hate Speech Moderation",
author = "Park, Junyeong and
Jeong, Seogyeong and
Song, Seyoung and
Lee, Yohan and
Oh, Alice",
editor = "Prabhakaran, Vinodkumar and
Dev, Sunipa and
Benotti, Luciana and
Hershcovich, Daniel and
Cao, Yong and
Zhou, Li and
Cabello, Laura and
Adebara, Ife",
booktitle = "Proceedings of the 3rd Workshop on Cross-Cultural Considerations in NLP (C3NLP 2025)",
month = may,
year = "2025",
address = "Albuquerque, New Mexico",
publisher = "Association for Computational Linguistics",
url = "https://aclanthology.org/2025.c3nlp-1.7/",
doi = "10.18653/v1/2025.c3nlp-1.7",
pages = "71--88",
ISBN = "979-8-89176-237-4",
abstract = "Content moderation platforms concentrate resources on English content despite serving predominantly non-English speaking users.Also, given the scarcity of native moderators for low-resource languages, non-native moderators must bridge this gap in moderation tasks such as hate speech moderation.Through a user study, we identify that non-native moderators struggle with understanding culturally-specific knowledge, sentiment, and internet culture in the hate speech.To assist non-native moderators, we present LLM-C3MOD, a human-LLM collaborative pipeline with three steps: (1) RAG-enhanced cultural context annotations; (2) initial LLM-based moderation; and (3) targeted human moderation for cases lacking LLM consensus.Evaluated on Korean hate speech dataset with Indonesian and German participants, our system achieves 78{\%} accuracy (surpassing GPT-4o{'}s 71{\%} baseline) while reducing human workload by 83.6{\%}.In addition, cultural context annotations improved non-native moderator accuracy from 22{\%} to 61{\%}, with humans notably excelling at nuanced tasks where LLMs struggle.Our findings demonstrate that non-native moderators, when properly supported by LLMs, can effectively contribute to cross-cultural hate speech moderation."
}

<?xml version="1.0" encoding="UTF-8"?>
<modsCollection xmlns="http://www.loc.gov/mods/v3">
  <mods ID="park-etal-2025-llm">
    <titleInfo>
      <title>LLM-C3MOD: A Human-LLM Collaborative System for Cross-Cultural Hate Speech Moderation</title>
    </titleInfo>
    <name type="personal">
      <namePart type="given">Junyeong</namePart>
      <namePart type="family">Park</namePart>
      <role>
        <roleTerm authority="marcrelator" type="text">author</roleTerm>
      </role>
    </name>
    <name type="personal">
      <namePart type="given">Seogyeong</namePart>
      <namePart type="family">Jeong</namePart>
      <role>
        <roleTerm authority="marcrelator" type="text">author</roleTerm>
      </role>
    </name>
    <name type="personal">
      <namePart type="given">Seyoung</namePart>
      <namePart type="family">Song</namePart>
      <role>
        <roleTerm authority="marcrelator" type="text">author</roleTerm>
      </role>
    </name>
    <name type="personal">
      <namePart type="given">Yohan</namePart>
      <namePart type="family">Lee</namePart>
      <role>
        <roleTerm authority="marcrelator" type="text">author</roleTerm>
      </role>
    </name>
    <name type="personal">
      <namePart type="given">Alice</namePart>
      <namePart type="family">Oh</namePart>
      <role>
        <roleTerm authority="marcrelator" type="text">author</roleTerm>
      </role>
    </name>
    <originInfo>
      <dateIssued>2025-05</dateIssued>
    </originInfo>
    <typeOfResource>text</typeOfResource>
    <relatedItem type="host">
      <titleInfo>
        <title>Proceedings of the 3rd Workshop on Cross-Cultural Considerations in NLP (C3NLP 2025)</title>
      </titleInfo>
      <name type="personal">
        <namePart type="given">Vinodkumar</namePart>
        <namePart type="family">Prabhakaran</namePart>
        <role>
          <roleTerm authority="marcrelator" type="text">editor</roleTerm>
        </role>
      </name>
      <name type="personal">
        <namePart type="given">Sunipa</namePart>
        <namePart type="family">Dev</namePart>
        <role>
          <roleTerm authority="marcrelator" type="text">editor</roleTerm>
        </role>
      </name>
      <name type="personal">
        <namePart type="given">Luciana</namePart>
        <namePart type="family">Benotti</namePart>
        <role>
          <roleTerm authority="marcrelator" type="text">editor</roleTerm>
        </role>
      </name>
      <name type="personal">
        <namePart type="given">Daniel</namePart>
        <namePart type="family">Hershcovich</namePart>
        <role>
          <roleTerm authority="marcrelator" type="text">editor</roleTerm>
        </role>
      </name>
      <name type="personal">
        <namePart type="given">Yong</namePart>
        <namePart type="family">Cao</namePart>
        <role>
          <roleTerm authority="marcrelator" type="text">editor</roleTerm>
        </role>
      </name>
      <name type="personal">
        <namePart type="given">Li</namePart>
        <namePart type="family">Zhou</namePart>
        <role>
          <roleTerm authority="marcrelator" type="text">editor</roleTerm>
        </role>
      </name>
      <name type="personal">
        <namePart type="given">Laura</namePart>
        <namePart type="family">Cabello</namePart>
        <role>
          <roleTerm authority="marcrelator" type="text">editor</roleTerm>
        </role>
      </name>
      <name type="personal">
        <namePart type="given">Ife</namePart>
        <namePart type="family">Adebara</namePart>
        <role>
          <roleTerm authority="marcrelator" type="text">editor</roleTerm>
        </role>
      </name>
      <originInfo>
        <publisher>Association for Computational Linguistics</publisher>
        <place>
          <placeTerm type="text">Albuquerque, New Mexico</placeTerm>
        </place>
      </originInfo>
      <genre authority="marcgt">conference publication</genre>
      <identifier type="isbn">979-8-89176-237-4</identifier>
    </relatedItem>
    <abstract>Content moderation platforms concentrate resources on English content despite serving predominantly non-English-speaking users. Also, given the scarcity of native moderators for low-resource languages, non-native moderators must bridge this gap in moderation tasks such as hate speech moderation. Through a user study, we identify that non-native moderators struggle with understanding culturally-specific knowledge, sentiment, and internet culture in hate speech. To assist non-native moderators, we present LLM-C3MOD, a human-LLM collaborative pipeline with three steps: (1) RAG-enhanced cultural context annotations; (2) initial LLM-based moderation; and (3) targeted human moderation for cases lacking LLM consensus. Evaluated on a Korean hate speech dataset with Indonesian and German participants, our system achieves 78% accuracy (surpassing GPT-4o’s 71% baseline) while reducing human workload by 83.6%. In addition, cultural context annotations improved non-native moderator accuracy from 22% to 61%, with humans notably excelling at nuanced tasks where LLMs struggle. Our findings demonstrate that non-native moderators, when properly supported by LLMs, can effectively contribute to cross-cultural hate speech moderation.</abstract>
<identifier type="citekey">park-etal-2025-llm</identifier>
<identifier type="doi">10.18653/v1/2025.c3nlp-1.7</identifier>
<location>
<url>https://aclanthology.org/2025.c3nlp-1.7/</url>
</location>
<part>
<date>2025-05</date>
<extent unit="page">
<start>71</start>
<end>88</end>
</extent>
</part>
</mods>
</modsCollection>
%0 Conference Proceedings
%T LLM-C3MOD: A Human-LLM Collaborative System for Cross-Cultural Hate Speech Moderation
%A Park, Junyeong
%A Jeong, Seogyeong
%A Song, Seyoung
%A Lee, Yohan
%A Oh, Alice
%Y Prabhakaran, Vinodkumar
%Y Dev, Sunipa
%Y Benotti, Luciana
%Y Hershcovich, Daniel
%Y Cao, Yong
%Y Zhou, Li
%Y Cabello, Laura
%Y Adebara, Ife
%S Proceedings of the 3rd Workshop on Cross-Cultural Considerations in NLP (C3NLP 2025)
%D 2025
%8 May
%I Association for Computational Linguistics
%C Albuquerque, New Mexico
%@ 979-8-89176-237-4
%F park-etal-2025-llm
%X Content moderation platforms concentrate resources on English content despite serving predominantly non-English-speaking users. Also, given the scarcity of native moderators for low-resource languages, non-native moderators must bridge this gap in moderation tasks such as hate speech moderation. Through a user study, we identify that non-native moderators struggle with understanding culturally-specific knowledge, sentiment, and internet culture in hate speech. To assist non-native moderators, we present LLM-C3MOD, a human-LLM collaborative pipeline with three steps: (1) RAG-enhanced cultural context annotations; (2) initial LLM-based moderation; and (3) targeted human moderation for cases lacking LLM consensus. Evaluated on a Korean hate speech dataset with Indonesian and German participants, our system achieves 78% accuracy (surpassing GPT-4o’s 71% baseline) while reducing human workload by 83.6%. In addition, cultural context annotations improved non-native moderator accuracy from 22% to 61%, with humans notably excelling at nuanced tasks where LLMs struggle. Our findings demonstrate that non-native moderators, when properly supported by LLMs, can effectively contribute to cross-cultural hate speech moderation.
%R 10.18653/v1/2025.c3nlp-1.7
%U https://aclanthology.org/2025.c3nlp-1.7/
%U https://doi.org/10.18653/v1/2025.c3nlp-1.7
%P 71-88

Markdown (Informal)

[LLM-C3MOD: A Human-LLM Collaborative System for Cross-Cultural Hate Speech Moderation](https://aclanthology.org/2025.c3nlp-1.7/) (Park et al., C3NLP 2025)

ACL

Junyeong Park, Seogyeong Jeong, Seyoung Song, Yohan Lee, and Alice Oh. 2025. LLM-C3MOD: A Human-LLM Collaborative System for Cross-Cultural Hate Speech Moderation. In Proceedings of the 3rd Workshop on Cross-Cultural Considerations in NLP (C3NLP 2025), pages 71–88, Albuquerque, New Mexico. Association for Computational Linguistics.