2025
Weight-based Analysis of Detokenization in Language Models: Understanding the First Stage of Inference Without Inference
Go Kamoda | Benjamin Heinzerling | Tatsuro Inaba | Keito Kudo | Keisuke Sakaguchi | Kentaro Inui
Findings of the Association for Computational Linguistics: NAACL 2025
According to the stages-of-inference hypothesis, early layers of language models map their subword-tokenized input, which does not necessarily correspond to a linguistically meaningful segmentation, to more meaningful representations that form the model’s “inner vocabulary”. Prior analysis of this *detokenization* stage has predominantly relied on probing and interventions such as path patching, which involve selecting particular inputs, choosing a subset of components that will be patched, and then observing changes in model behavior. Here, we show that several important aspects of the detokenization stage can be understood purely by analyzing model weights, without performing any model inference steps. Specifically, we introduce an analytical decomposition of first-layer attention in GPT-2. Our decomposition yields interpretable terms that quantify the relative contributions of position-related, token-related, and mixed effects. By focusing on terms in this decomposition, we discover weight-based explanations of attention bias toward close tokens and attention for detokenization.
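As an illustrative sketch of what such a weight-based decomposition can look like (the notation below is ours, not necessarily the paper’s exact formulation, and GPT-2’s pre-attention layer normalization is ignored for clarity): the first-layer input is the sum of a token embedding t_i and a position embedding p_i, so the pre-softmax attention score splits into four bilinear terms that can be computed from weights alone.

```latex
% Sketch of a first-layer attention-score decomposition (assumed notation).
\[
s_{ij} = (t_i + p_i)\, A\, (t_j + p_j)^\top
       = \underbrace{t_i A t_j^\top}_{\text{token--token}}
       + \underbrace{t_i A p_j^\top}_{\text{token--position}}
       + \underbrace{p_i A t_j^\top}_{\text{position--token}}
       + \underbrace{p_i A p_j^\top}_{\text{position--position}},
\qquad A = \frac{W_Q W_K^\top}{\sqrt{d_k}} .
\]
```

On this reading, the position–position term can encode a bias toward nearby positions, while the token–token term can encode which subword pairs attend to each other, i.e., detokenization; neither requires running the model on any input.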
How a Bilingual LM Becomes Bilingual: Tracing Internal Representations with Sparse Autoencoders
Tatsuro Inaba | Go Kamoda | Kentaro Inui | Masaru Isonuma | Yusuke Miyao | Yohei Oseki | Yu Takagi | Benjamin Heinzerling
Findings of the Association for Computational Linguistics: EMNLP 2025
This study explores how bilingual language models develop complex internal representations. We employ sparse autoencoders to analyze the internal representations of bilingual language models, focusing on the effects of training steps, layers, and model sizes. Our analysis shows that language models first learn each language separately and then gradually form bilingual alignments, particularly in the middle layers. We also find that this bilingual tendency is stronger in larger models. Building on these findings, we demonstrate the critical role of bilingual representations in model performance by employing a novel method that integrates decomposed representations from a fully trained model into a mid-training model. Our results provide insights into how language models acquire bilingual capabilities.
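A minimal sketch of the kind of sparse autoencoder used for this style of analysis (this is an illustrative reconstruction, not the authors’ implementation; the hyperparameters and tensor names are assumptions):

```python
# Minimal sparse-autoencoder sketch for analyzing hidden states (PyTorch).
import torch
import torch.nn as nn

class SparseAutoencoder(nn.Module):
    def __init__(self, d_model: int, d_hidden: int):
        super().__init__()
        self.encoder = nn.Linear(d_model, d_hidden)
        self.decoder = nn.Linear(d_hidden, d_model)

    def forward(self, x: torch.Tensor):
        z = torch.relu(self.encoder(x))   # sparse feature activations
        x_hat = self.decoder(z)           # reconstruction of the hidden state
        return x_hat, z

def sae_loss(x, x_hat, z, l1_coeff=1e-3):
    # Reconstruction loss plus an L1 penalty that encourages each
    # hidden state to activate only a few features.
    return ((x - x_hat) ** 2).mean() + l1_coeff * z.abs().mean()

# Usage sketch: fit the SAE on activations collected from one layer,
# then compare which features fire for parallel inputs in the model's
# two training languages to quantify bilingual feature overlap.
sae = SparseAutoencoder(d_model=768, d_hidden=768 * 8)
opt = torch.optim.Adam(sae.parameters(), lr=1e-4)
hidden_states = torch.randn(4096, 768)   # placeholder activations
for _ in range(100):
    x_hat, z = sae(hidden_states)
    loss = sae_loss(hidden_states, x_hat, z)
    opt.zero_grad()
    loss.backward()
    opt.step()
```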
2023
MultiTool-CoT: GPT-3 Can Use Multiple External Tools with Chain of Thought Prompting
Tatsuro Inaba | Hirokazu Kiyomaru | Fei Cheng | Sadao Kurohashi
Proceedings of the 61st Annual Meeting of the Association for Computational Linguistics (Volume 2: Short Papers)
Large language models (LLMs) have achieved impressive performance on various reasoning tasks. To further improve performance, we propose MultiTool-CoT, a novel framework that leverages chain-of-thought (CoT) prompting to incorporate multiple external tools, such as a calculator and a knowledge retriever, into the reasoning process. We apply MultiTool-CoT to the Task 2 dataset of NumGLUE, which requires both numerical reasoning and domain-specific knowledge. Experiments show that our method significantly outperforms strong baselines and achieves state-of-the-art performance.
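A sketch of a MultiTool-CoT-style control loop (the tool-call marker format, the canned generation stub, and the function names below are illustrative assumptions, not the paper’s exact prompt design): the LLM writes a chain of thought, and whenever it emits a tool-call marker, generation pauses, the tool runs, and its output is appended before generation resumes.

```python
import re

def calculator(expression: str) -> str:
    # Toy calculator tool; a knowledge retriever would slot in the same way.
    return str(eval(expression, {"__builtins__": {}}))

TOOLS = {"Calculator": calculator}
CALL_PATTERN = re.compile(r"<<(\w+)>>\s*(.+)")

def generate(context: str) -> str:
    # Placeholder LLM: in the real framework this is a GPT-3 completion call
    # prompted with few-shot examples of tool use. Canned steps for demo only.
    if "->" not in context:
        return "The total cost is 12 * 7. <<Calculator>> 12 * 7"
    return "Answer: 84"

def multitool_cot(question: str, max_steps: int = 8) -> str:
    context = question
    for _ in range(max_steps):
        step = generate(context)          # next reasoning step from the LLM
        context += "\n" + step
        match = CALL_PATTERN.search(step)
        if match:                         # a tool was invoked mid-reasoning
            name, arg = match.groups()
            context += "\n-> " + TOOLS[name](arg)
        elif "Answer:" in step:
            return step                   # chain of thought reached an answer
    return context

print(multitool_cot("What is the total cost of 12 items at 7 dollars each?"))
```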