Kaito Sugimoto
2025
Chakoshi: A Customizable Guardrail for LLMs with a Focus on Japanese-Language Moderation
Kazuhiro Arai | Ryota Matsui | Kenji Miyama | Yudai Yamamoto | Ren Shibamiya | Kaito Sugimoto | Yoshimasa Iwase
Proceedings of the 15th International Conference on Recent Advances in Natural Language Processing - Natural Language Processing in the Generative AI Era
In this research, we developed and evaluated “chakoshi,” an LLM guardrail model designed to address Japanese-specific nuances. chakoshi is a lightweight LLM fine-tuned on multiple open datasets and proprietary training datasets. Based on gemma-2-9b, chakoshi achieved an average F1 score of 0.92 or higher across multiple test datasets, outperforming existing models. Additionally, we implemented a feature that allows users to customize the categories to be filtered using natural language, and confirmed its effectiveness through practical examples.
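The abstract describes a guardrail LLM whose filtered categories can be customized in natural language. The following is a minimal sketch of what such an interface could look like; the model checkpoint, prompt template, and SAFE/UNSAFE labels are assumptions for illustration (the paper's actual fine-tuned chakoshi model and its interface are not public), and only the gemma-2-9b base model is taken from the abstract.

```python
# Hypothetical sketch: natural-language-customizable moderation with a
# guardrail LLM. Model name, prompt format, and labels are assumptions,
# not chakoshi's published interface.
from transformers import AutoModelForCausalLM, AutoTokenizer

MODEL_NAME = "google/gemma-2-9b"  # base model named in the abstract

tokenizer = AutoTokenizer.from_pretrained(MODEL_NAME)
model = AutoModelForCausalLM.from_pretrained(MODEL_NAME, device_map="auto")

def moderate(text: str, category: str) -> str:
    """Classify `text` as SAFE or UNSAFE for a category given in natural language."""
    prompt = (
        "You are a content moderator.\n"
        f"Blocked category (natural-language definition): {category}\n"
        f"Text: {text}\n"
        "Answer with SAFE or UNSAFE:"
    )
    inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
    output = model.generate(**inputs, max_new_tokens=4, do_sample=False)
    # Decode only the newly generated tokens after the prompt.
    answer = tokenizer.decode(
        output[0][inputs["input_ids"].shape[1]:], skip_special_tokens=True
    )
    return "UNSAFE" if "UNSAFE" in answer.upper() else "SAFE"

# Example: a Japanese recruitment message for an illicit part-time job.
print(moderate("闇バイトの募集です。高収入保証。", "Recruitment for illegal part-time jobs"))
```

Expressing the category as free-form text, rather than a fixed taxonomy, is what lets such a guardrail be re-targeted without retraining.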
2022
Incorporating the Rhetoric of Scientific Language into Sentence Embeddings using Phrase-guided Distant Supervision and Metric Learning
Kaito Sugimoto | Akiko Aizawa
Proceedings of the Third Workshop on Scholarly Document Processing
Communicative functions are an important rhetorical feature of scientific writing. Sentence embeddings that contain such features are highly valuable for the argumentative analysis of scientific documents, with applications in document alignment, recommendation, and academic writing assistance. Moreover, embeddings can provide a possible solution to the open-set problem, where models need to generalize to new communicative functions unseen at training time. However, existing sentence representation models are not suited for detecting functional similarity since they only consider lexical or semantic similarities. To remedy this, we propose a combined approach of distant supervision and metric learning to make a representation model more aware of the functional part of a sentence. We first leverage an existing academic phrase database to label sentences automatically with their functions. Then, we train an embedding model to capture similarities and dissimilarities from a rhetorical perspective. The experimental results demonstrate that the embeddings obtained from our model are more advantageous than those from existing models when retrieving functionally similar sentences. We also provide an extensive analysis of the performance differences between five metric learning objectives, revealing that traditional methods (e.g., softmax cross-entropy loss and triplet loss) outperform state-of-the-art techniques.
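One of the traditional objectives the abstract highlights is the triplet loss. Below is a minimal sketch of triplet-based metric learning over sentence embeddings, under the assumption that anchor/positive pairs share a distantly supervised function label and the negative does not; the toy encoder and random token ids are stand-ins for the paper's actual model and phrase-labeled data.

```python
# Sketch of triplet-loss metric learning for function-aware sentence
# embeddings. Encoder architecture and data are hypothetical placeholders.
import torch
import torch.nn as nn

class MeanPoolEncoder(nn.Module):
    """Toy encoder: embedding lookup + mean pooling (stand-in for a transformer)."""
    def __init__(self, vocab_size: int = 1000, dim: int = 64):
        super().__init__()
        self.emb = nn.Embedding(vocab_size, dim)

    def forward(self, token_ids: torch.Tensor) -> torch.Tensor:
        # (batch, seq_len) -> (batch, dim)
        return self.emb(token_ids).mean(dim=1)

encoder = MeanPoolEncoder()
triplet = nn.TripletMarginLoss(margin=1.0)
opt = torch.optim.Adam(encoder.parameters(), lr=1e-3)

# Anchor and positive share a communicative function (same distant label);
# the negative has a different function. Token ids are random placeholders.
anchor = torch.randint(0, 1000, (8, 12))
positive = torch.randint(0, 1000, (8, 12))
negative = torch.randint(0, 1000, (8, 12))

opt.zero_grad()
loss = triplet(encoder(anchor), encoder(positive), encoder(negative))
loss.backward()
opt.step()
print(f"triplet loss: {loss.item():.4f}")
```

The margin pulls functionally similar sentences together and pushes dissimilar ones apart, which is the behavior the paper evaluates against softmax cross-entropy and more recent objectives.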