Kazuhiro Arai


2025

Chakoshi: A Customizable Guardrail for LLMs with a Focus on Japanese-Language Moderation
Kazuhiro Arai | Ryota Matsui | Kenji Miyama | Yudai Yamamoto | Ren Shibamiya | Kaito Sugimoto | Yoshimasa Iwase
Proceedings of the 15th International Conference on Recent Advances in Natural Language Processing - Natural Language Processing in the Generative AI Era

In this research, we developed and evaluated “chakoshi”, an LLM guardrail model designed to handle nuances specific to the Japanese language. chakoshi is a lightweight LLM fine-tuned on multiple open datasets and proprietary training datasets. Built on gemma-2-9b, the chakoshi model achieved an average F1 score of 0.92 or higher across multiple test datasets, outperforming existing models. Additionally, we implemented a feature that lets users specify, in natural language, the categories to be filtered, and confirmed its effectiveness through practical examples.
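
For readers unfamiliar with the metric, the sketch below shows one common way an average F1 score across several test datasets can be computed. It assumes an unweighted mean of per-dataset F1 scores; the abstract does not state which averaging scheme the authors used. The function names, dataset names, and counts are hypothetical and are not results from the paper.

```python
def f1_score(tp: int, fp: int, fn: int) -> float:
    """F1 = 2*TP / (2*TP + FP + FN); returns 0.0 when the score is undefined."""
    denom = 2 * tp + fp + fn
    return 2 * tp / denom if denom else 0.0


def average_f1(counts_by_dataset: dict[str, tuple[int, int, int]]) -> float:
    """Unweighted mean of per-dataset F1 scores (an assumed averaging scheme)."""
    if not counts_by_dataset:
        raise ValueError("at least one dataset is required")
    scores = [f1_score(*counts) for counts in counts_by_dataset.values()]
    return sum(scores) / len(scores)


# Hypothetical (TP, FP, FN) counts for illustration only; not from the paper.
example_counts = {
    "dataset_a": (90, 10, 10),
    "dataset_b": (45, 5, 5),
}
print(average_f1(example_counts))
```

An unweighted mean treats every test dataset equally regardless of its size; weighting by the number of examples would be an equally plausible choice.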