Chakoshi: A Customizable Guardrail for LLMs with a Focus on Japanese-Language Moderation

Kazuhiro Arai, Ryota Matsui, Kenji Miyama, Yudai Yamamoto, Ren Shibamiya, Kaito Sugimoto, Yoshimasa Iwase


Abstract
In this research, we developed and evaluated “chakoshi”, an LLM guardrail model designed to address Japanese-specific nuances. chakoshi is a lightweight LLM fine-tuned on multiple open datasets together with a proprietary training dataset. Built on gemma-2-9b, chakoshi achieved an average F1 score of 0.92 or higher across multiple test datasets, outperforming existing models. Additionally, we implemented a feature that lets users customize the categories to be filtered using natural language, and we confirmed its effectiveness through practical examples.
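As an illustration only (not taken from the paper), the sketch below shows how a guardrail LLM of this kind might be invoked with a natural-language category definition. The base model ID follows the abstract's mention of gemma-2-9b, but the prompt template, label set, and `moderate` helper are assumptions for illustration, not chakoshi's actual interface.

```python
# Hypothetical sketch of a natural-language-customizable guardrail call.
# The prompt format and SAFE/UNSAFE labels are illustrative assumptions;
# only the gemma-2-9b base model is stated in the abstract.
from transformers import AutoModelForCausalLM, AutoTokenizer

MODEL_ID = "google/gemma-2-9b"  # base model per the abstract; fine-tuned chakoshi weights are not assumed public

tokenizer = AutoTokenizer.from_pretrained(MODEL_ID)
model = AutoModelForCausalLM.from_pretrained(MODEL_ID)

def moderate(text: str, category: str) -> str:
    """Classify `text` against a user-defined moderation category (hypothetical interface)."""
    prompt = (
        f"Category to filter: {category}\n"
        f"Text: {text}\n"
        "Answer SAFE or UNSAFE:"
    )
    inputs = tokenizer(prompt, return_tensors="pt")
    output = model.generate(**inputs, max_new_tokens=3)
    # Decode only the newly generated tokens after the prompt.
    new_tokens = output[0][inputs["input_ids"].shape[1]:]
    return tokenizer.decode(new_tokens, skip_special_tokens=True).strip()

# Example: the category is specified in plain natural language.
print(moderate("...", "mentions of illegal gambling"))
```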
Anthology ID:
2025.ranlp-1.14
Volume:
Proceedings of the 15th International Conference on Recent Advances in Natural Language Processing - Natural Language Processing in the Generative AI Era
Month:
September
Year:
2025
Address:
Varna, Bulgaria
Editors:
Galia Angelova, Maria Kunilovskaya, Marie Escribe, Ruslan Mitkov
Venue:
RANLP
Publisher:
INCOMA Ltd., Shoumen, Bulgaria
Pages:
118–124
URL:
https://aclanthology.org/2025.ranlp-1.14/
Cite (ACL):
Kazuhiro Arai, Ryota Matsui, Kenji Miyama, Yudai Yamamoto, Ren Shibamiya, Kaito Sugimoto, and Yoshimasa Iwase. 2025. Chakoshi: A Customizable Guardrail for LLMs with a Focus on Japanese-Language Moderation. In Proceedings of the 15th International Conference on Recent Advances in Natural Language Processing - Natural Language Processing in the Generative AI Era, pages 118–124, Varna, Bulgaria. INCOMA Ltd., Shoumen, Bulgaria.
Cite (Informal):
Chakoshi: A Customizable Guardrail for LLMs with a Focus on Japanese-Language Moderation (Arai et al., RANLP 2025)
PDF:
https://aclanthology.org/2025.ranlp-1.14.pdf