Takuya Kato


2026

In Japan, the term "Gyaru-Mind" is commonly used to describe an upbeat mindset associated with gyaru culture, often linked to proactive positivity and strong self-affirmation. Although it is widely regarded as beneficial, "Gyaru-Mind" lacks an academic operationalization and a practical method for internalization. In this work, we define a quantitative index, "GYARU-MIDX", built from eight text-based factors, and implement a dialogue agent named GYARU-AI that uses this index in real time. During conversation, the agent estimates the user’s score and produces brief, context-appropriate replies by choosing between advice and empathy, so its responses are not uniformly positive. A live "GYARU-MIDX" view provides real-time feedback for reflection and practice. The current system supports Japanese only, as it is trained on Japanese "gyaru"-style text. We describe the initial design and modeling results, and outline limitations and next steps.
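The abstract does not enumerate the eight text-based factors or the advice/empathy policy, so the following is only a toy sketch of how such an index and response-mode choice might be wired together: the factor names, equal weights, and the 50-point threshold are all hypothetical, not the paper's method.

```python
from typing import Dict

# HYPOTHETICAL factor names: the paper's eight text-based factors are not
# listed in the abstract, so these are illustrative placeholders.
HYPOTHETICAL_FACTORS = [
    "positivity", "self_affirmation", "proactivity", "resilience",
    "openness", "humor", "gratitude", "energy",
]

def gyaru_midx(scores: Dict[str, float]) -> float:
    """Average eight per-factor scores (each in [0, 1]) onto a 0-100 scale.

    Factors missing from `scores` default to 0.0. Equal weighting is an
    assumption for illustration.
    """
    total = sum(scores.get(f, 0.0) for f in HYPOTHETICAL_FACTORS)
    return 100.0 * total / len(HYPOTHETICAL_FACTORS)

def choose_mode(index: float, threshold: float = 50.0) -> str:
    """Toy policy: empathize when the estimated index is low, advise otherwise."""
    return "empathy" if index < threshold else "advice"
```

In an actual system the per-factor scores would come from text classifiers over the user's utterances; here they are simply supplied as a dictionary.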

2025

This paper empirically investigates the relationship between subword vocabulary size and the performance of large language models (LLMs) to provide insight into how vocabulary size should be chosen. Experimental results show that larger vocabularies lead to better LLM performance. We also consider a continual training scenario in which a pre-trained language model is further trained on a different target language, and introduce a simple method for adopting a new vocabulary in place of the pre-defined one. We show that models using the new vocabulary outperform those that retain the vocabulary used in pre-training.
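The abstract does not detail how the new vocabulary is attached to the pre-trained model, so the sketch below shows one common approach to this kind of vocabulary swap, not necessarily the paper's: build a fresh embedding matrix for the new vocabulary, copy pre-trained vectors for tokens shared with the old vocabulary, and randomly initialize the rest. The function name and initialization scale are assumptions.

```python
import numpy as np

def swap_vocabulary(old_vocab, old_emb, new_vocab, seed=0):
    """Build an embedding matrix for `new_vocab` from a pre-trained one.

    Tokens that also appear in `old_vocab` keep their pre-trained vectors;
    tokens new to the vocabulary get small random initializations. This is
    a generic re-initialization scheme, not the paper's specific method.
    """
    dim = old_emb.shape[1]
    rng = np.random.default_rng(seed)
    # Random init for every row, then overwrite the rows we can copy.
    new_emb = rng.normal(0.0, 0.02, size=(len(new_vocab), dim))
    old_index = {tok: i for i, tok in enumerate(old_vocab)}
    for j, tok in enumerate(new_vocab):
        if tok in old_index:
            new_emb[j] = old_emb[old_index[tok]]
    return new_emb
```

In a continual-training setup, the resulting matrix would replace the model's input (and typically output) embeddings before training resumes on the target language.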