Northeastern Uni at Multilingual Counterspeech Generation: Enhancing Counter Speech Generation with LLM Alignment through Direct Preference Optimization

Sahil Wadhwa, Chengtian Xu, Haoming Chen, Aakash Mahalingam, Akankshya Kar, Divya Chaudhary


Abstract
The automatic generation of counter-speech (CS) is a critical strategy for addressing hate speech by providing constructive and informed responses. However, existing methods often fail to generate high-quality, impactful, and scalable CS, particularly across diverse linguistic contexts. In this paper, we propose a novel methodology to enhance CS generation by aligning Large Language Models (LLMs) using Supervised Fine-Tuning (SFT) and Direct Preference Optimization (DPO). Our approach leverages DPO to align LLM outputs with human preferences, ensuring contextually appropriate and linguistically adaptable responses. Additionally, we incorporate knowledge grounding to enhance the factual accuracy and relevance of generated CS. Experimental results demonstrate that DPO-aligned models significantly outperform SFT baselines on CS benchmarks while scaling effectively to multiple languages. These findings highlight the potential of preference-based alignment techniques to advance CS generation across varied linguistic settings. Model supervision and alignment are performed in English, and the same model is then used to report metrics for other languages, namely Basque, Italian, and Spanish.
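For readers unfamiliar with DPO, the following is a minimal sketch of the per-pair objective as it is commonly formulated, not the authors' training code. It assumes that summed sequence log-probabilities for a preferred ("chosen") and a dispreferred ("rejected") counter-speech response are already available from both the policy model and a frozen reference model (for example, the SFT checkpoint). The function names, the value of beta, and the example numbers are illustrative assumptions; the abstract does not state them.

```python
import math


def softplus(x):
    """Numerically stable log(1 + exp(x))."""
    return max(x, 0.0) + math.log1p(math.exp(-abs(x)))


def dpo_loss(policy_chosen_logp, policy_rejected_logp,
             ref_chosen_logp, ref_rejected_logp, beta=0.1):
    """Per-pair DPO loss from summed sequence log-probabilities.

    beta (assumed value here) scales how strongly the policy is
    pushed away from the reference model.
    """
    # Implicit rewards: how much more likely the policy makes each
    # response compared with the frozen reference model.
    chosen_reward = beta * (policy_chosen_logp - ref_chosen_logp)
    rejected_reward = beta * (policy_rejected_logp - ref_rejected_logp)
    margin = chosen_reward - rejected_reward
    # -log(sigmoid(margin)) is the same quantity as softplus(-margin).
    return softplus(-margin)


# Illustrative log-probabilities (made up for this example).
untrained = dpo_loss(-10.0, -12.0, -10.0, -12.0)  # policy == reference
improved = dpo_loss(-8.0, -14.0, -10.0, -12.0)    # policy favors chosen more
print(untrained, improved)
```

When the policy matches the reference, the margin is zero and the loss equals log 2; the loss falls as the policy raises the chosen response's likelihood relative to the rejected one, measured against the reference.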
Anthology ID:
2025.mcg-1.3
Volume:
Proceedings of the First Workshop on Multilingual Counterspeech Generation
Month:
January
Year:
2025
Address:
Abu Dhabi, UAE
Editors:
Helena Bonaldi, María Estrella Vallecillo-Rodríguez, Irune Zubiaga, Arturo Montejo-Ráez, Aitor Soroa, María Teresa Martín-Valdivia, Marco Guerini, Rodrigo Agerri
Venues:
MCG | WS
Publisher:
Association for Computational Linguistics
Pages:
19–28
URL:
https://aclanthology.org/2025.mcg-1.3/
Cite (ACL):
Sahil Wadhwa, Chengtian Xu, Haoming Chen, Aakash Mahalingam, Akankshya Kar, and Divya Chaudhary. 2025. Northeastern Uni at Multilingual Counterspeech Generation: Enhancing Counter Speech Generation with LLM Alignment through Direct Preference Optimization. In Proceedings of the First Workshop on Multilingual Counterspeech Generation, pages 19–28, Abu Dhabi, UAE. Association for Computational Linguistics.
Cite (Informal):
Northeastern Uni at Multilingual Counterspeech Generation: Enhancing Counter Speech Generation with LLM Alignment through Direct Preference Optimization (Wadhwa et al., MCG 2025)
PDF:
https://aclanthology.org/2025.mcg-1.3.pdf