SEA-SafeguardBench: Culturally Grounded Safety Benchmark for Southeast Asian Languages

Panuthep Tasawong; Jian Gang Ngui; Alham Fikri Aji; Trevor Cohn; Peerat Limkonchotiwat

Abstract

Safeguard models help large language models (LLMs) detect and block harmful content, but most evaluations remain English-centric and overlook linguistic and cultural diversity. Existing multilingual safety benchmarks often rely on machine-translated English data, which fails to capture nuances in low-resource languages. Southeast Asian (SEA) languages are underrepresented despite the region’s linguistic diversity and unique safety concerns, from culturally sensitive political speech to region-specific misinformation. Addressing these gaps requires benchmarks that are natively authored to reflect local norms and harm scenarios. We introduce SEA-SafeguardBench, the first human-verified safety benchmark for SEA, comprising 21,640 samples in eight languages across three subsets: general, in-the-wild, and content generation. Experimental results on our benchmark show that even state-of-the-art LLMs and guardrails struggle with SEA cultural and harm scenarios and perform worse than they do on English text.

Anthology ID: 2026.findings-acl.194
Volume: Findings of the Association for Computational Linguistics: ACL 2026
Month: July
Year: 2026
Address: San Diego, California, United States
Editors: Maria Liakata, Viviane P. Moreira, Jiajun Zhang, David Jurgens
Venue: Findings
Publisher: Association for Computational Linguistics
Pages: 3973–4003
URL: https://aclanthology.org/2026.findings-acl.194/
Cite (ACL): Panuthep Tasawong, Jian Gang Ngui, Alham Fikri Aji, Trevor Cohn, and Peerat Limkonchotiwat. 2026. SEA-SafeguardBench: Culturally Grounded Safety Benchmark for Southeast Asian Languages. In Findings of the Association for Computational Linguistics: ACL 2026, pages 3973–4003, San Diego, California, United States. Association for Computational Linguistics.
Cite (Informal): SEA-SafeguardBench: Culturally Grounded Safety Benchmark for Southeast Asian Languages (Tasawong et al., Findings 2026)
PDF: https://aclanthology.org/2026.findings-acl.194.pdf
Checklist: 2026.findings-acl.194.checklist.pdf
