Privacy-Preserving Federated Learning for Hate Speech Detection

Ivo de Souza Bueno Júnior; Haotian Ye; Axel Wisiorek; Hinrich Schütze

doi:10.18653/v1/2025.naacl-srw.13

Abstract

This paper presents a federated learning system with differential privacy for hate speech detection, tailored to low-resource languages. Among the pre-trained language models that were fine-tuned, ALBERT emerged as the most effective option for balancing performance and privacy. Experiments demonstrated that federated learning with differential privacy performs adequately in low-resource settings, though performance suffered on datasets with fewer than 20 sentences per client because of excessive noise. Balanced datasets and the augmentation of hateful data with non-hateful examples proved critical for improving model utility. These findings offer a scalable and privacy-conscious framework for integrating hate speech detection into social media platforms and browsers, safeguarding user privacy while addressing online harm.

Anthology ID: 2025.naacl-srw.13
Volume: Proceedings of the 2025 Conference of the Nations of the Americas Chapter of the Association for Computational Linguistics: Human Language Technologies (Volume 4: Student Research Workshop)
Month: April
Year: 2025
Address: Albuquerque, USA
Editors: Abteen Ebrahimi, Samar Haider, Emmy Liu, Sammar Haider, Maria Leonor Pacheco, Shira Wein
Venues: NAACL | WS
Publisher: Association for Computational Linguistics
Pages: 129–141
URL: https://aclanthology.org/2025.naacl-srw.13/
DOI: 10.18653/v1/2025.naacl-srw.13
Cite (ACL): Ivo de Souza Bueno Júnior, Haotian Ye, Axel Wisiorek, and Hinrich Schütze. 2025. Privacy-Preserving Federated Learning for Hate Speech Detection. In Proceedings of the 2025 Conference of the Nations of the Americas Chapter of the Association for Computational Linguistics: Human Language Technologies (Volume 4: Student Research Workshop), pages 129–141, Albuquerque, USA. Association for Computational Linguistics.
Cite (Informal): Privacy-Preserving Federated Learning for Hate Speech Detection (de Souza Bueno Júnior et al., NAACL 2025)
PDF: https://aclanthology.org/2025.naacl-srw.13.pdf
