Mitigating Language Confusion through Inference-time Intervention

Xie Yunfan, Lixin Zou, Dan Luo, Min Tang, Chenliang Li, Xiangyang Luo, Liming Dong


Abstract
Although large language models (LLMs) trained on extensive multilingual corpora exhibit impressive language transfer, they often fail to respond in the user’s desired language due to corpus imbalances, an embarrassingly simple problem known as language confusion. However, existing solutions like in-context learning and supervised fine-tuning (SFT) have drawbacks: in-context learning consumes context window space, diminishing attention as text lengthens, while SFT requires extensive, labor-intensive data collection. To overcome these limitations, we propose the language-sensitive intervention (LSI), a novel, lightweight, and label-free approach. Specifically, we analyze language confusion from a causal perspective, revealing that the training corpus’s language distribution acts as a confounder, disadvantaging languages that are underrepresented in the dataset. Then, we identify a language-sensitive dimension in the LLM’s residual stream, i.e., the language vector, which allows us to estimate the average causal effect of prompts on this dimension. During inference, we directly intervene on the language vector to generate responses in the desired language. To further advance research on this issue, we introduce a new benchmark that detects language confusion and assesses content quality. Experimental results demonstrate that our method effectively mitigates language confusion without additional complex mechanisms. Our code is available at https://github.com/SoseloX/LSI.
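
The general idea of steering a residual-stream direction at inference time can be illustrated with a minimal sketch. This is not the authors' implementation (see the repository linked above for that); it assumes, purely for illustration, that the language vector is approximated by a difference of mean activations and that the intervention is a simple additive shift. The function names, the `alpha` scaling factor, and the toy numbers are all hypothetical.

```python
from statistics import fmean

def mean_vector(rows):
    """Column-wise mean of a list of equal-length activation vectors."""
    return [fmean(col) for col in zip(*rows)]

def language_vector(with_lang, without_lang):
    """Difference of mean activations: a crude, assumed estimate of the
    average effect that a target-language prompt has on the hidden state."""
    mu_with = mean_vector(with_lang)
    mu_without = mean_vector(without_lang)
    return [a - b for a, b in zip(mu_with, mu_without)]

def intervene(hidden, direction, alpha=1.0):
    """Shift one hidden state along the language direction (hypothetical alpha)."""
    return [h + alpha * d for h, d in zip(hidden, direction)]

# Toy 3-dimensional activations (made-up numbers, not model outputs).
acts_target = [[1.0, 0.0, 2.0], [3.0, 0.0, 2.0]]
acts_base = [[0.0, 0.0, 2.0], [0.0, 0.0, 2.0]]

v = language_vector(acts_target, acts_base)
steered = intervene([0.5, 0.5, 0.5], v, alpha=0.5)
```

In a real model the activation lists would come from a chosen layer's residual stream, and the shift would be applied during the forward pass; which layer and scaling to use is left open here.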
Anthology ID: 2025.coling-main.563
Volume: Proceedings of the 31st International Conference on Computational Linguistics
Month: January
Year: 2025
Address: Abu Dhabi, UAE
Editors: Owen Rambow, Leo Wanner, Marianna Apidianaki, Hend Al-Khalifa, Barbara Di Eugenio, Steven Schockaert
Venue: COLING
Publisher: Association for Computational Linguistics
Pages: 8418–8431
URL: https://aclanthology.org/2025.coling-main.563/
Cite (ACL): Xie Yunfan, Lixin Zou, Dan Luo, Min Tang, Chenliang Li, Xiangyang Luo, and Liming Dong. 2025. Mitigating Language Confusion through Inference-time Intervention. In Proceedings of the 31st International Conference on Computational Linguistics, pages 8418–8431, Abu Dhabi, UAE. Association for Computational Linguistics.
Cite (Informal): Mitigating Language Confusion through Inference-time Intervention (Yunfan et al., COLING 2025)
PDF: https://aclanthology.org/2025.coling-main.563.pdf