Nuanced Toxicity Detection in Spanish: A New Corpus and Benchmark Study

Alba María Mármol-Romero; Robiert Sepúlveda-Torres; Estela Saquete; María-Teresa Martín-Valdivia; L. Alfonso Ureña

Abstract

The rise of toxic content on digital platforms has intensified the demand for automatic moderation tools. While English has benefited from large-scale annotated corpora, Spanish remains under-resourced, particularly for nuanced cases of toxicity such as irony, sarcasm, or indirect aggression. We present an extended version of the NECOS-TOX corpus, comprising 4,011 Spanish comments collected from 16 major news outlets. Each comment is annotated across three levels of toxicity (Non-Toxic, Slightly Toxic, and Toxic), following an iterative annotation protocol that achieved substantial inter-annotator agreement (κ = 0.74). To reduce annotation costs while maintaining quality, we employed a human-in-the-loop active learning strategy, with manual correction of model pre-labels. We benchmarked the dataset with traditional machine learning (ML) methods, domain-specific transformers, and instruction-tuned large language models (LLMs). Results show that compact encoder models (e.g., RoBERTa-base-bne, 125M parameters) perform on par with much larger models (e.g., LLaMA-3.1-8B), underscoring the value of in-domain adaptation over raw scale. Our error analysis highlights persistent challenges in distinguishing subtle forms of toxicity, especially sarcasm and implicit insults, and reveals entity-related biases that motivate anonymization strategies. The dataset and trained models are released publicly.

Anthology ID: 2026.findings-eacl.100
Volume: Findings of the Association for Computational Linguistics: EACL 2026
Month: March
Year: 2026
Address: Rabat, Morocco
Editors: Vera Demberg, Kentaro Inui, Lluís Marquez
Venue: Findings
Publisher: Association for Computational Linguistics
Pages: 1940–1954
URL: https://aclanthology.org/2026.findings-eacl.100/
Cite (ACL): Alba María Mármol-Romero, Robiert Sepúlveda-Torres, Estela Saquete, María-Teresa Martín-Valdivia, and L. Alfonso Ureña. 2026. Nuanced Toxicity Detection in Spanish: A New Corpus and Benchmark Study. In Findings of the Association for Computational Linguistics: EACL 2026, pages 1940–1954, Rabat, Morocco. Association for Computational Linguistics.
Cite (Informal): Nuanced Toxicity Detection in Spanish: A New Corpus and Benchmark Study (Mármol-Romero et al., Findings 2026)
PDF: https://aclanthology.org/2026.findings-eacl.100.pdf
Checklist: 2026.findings-eacl.100.checklist.pdf
