Language Mixing in Reasoning Language Models: Patterns, Impact, and Internal Causes

Mingyang Wang; Lukas Lange; Heike Adel; Yunpu Ma; Jannik Strötgen; Hinrich Schütze

doi:10.18653/v1/2025.emnlp-main.132

Abstract

Reasoning language models (RLMs) excel at complex tasks by leveraging a chain-of-thought process to generate structured intermediate steps. However, language mixing, i.e., reasoning steps containing tokens from languages other than the prompt, has been observed in their outputs and shown to affect performance, though its impact remains debated. We present the first systematic study of language mixing in RLMs, examining its patterns, impact, and internal causes across 15 languages, 7 task difficulty levels, and 18 subject areas, and show how all three factors influence language mixing. Moreover, we demonstrate that the choice of reasoning language significantly affects performance: forcing models to reason in Latin or Han scripts via constrained decoding notably improves accuracy. Finally, we show that the script composition of reasoning traces closely aligns with that of the model’s internal representations, indicating that language mixing reflects latent processing preferences in RLMs. Our findings provide actionable insights for optimizing multilingual reasoning and open new directions for reasoning language control to build more interpretable and adaptable RLMs.
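
To make the notion of "script composition" concrete, here is a minimal sketch of how the share of each writing script in a reasoning trace could be estimated. It is an illustration only, not the authors' measurement procedure: the prefix table `SCRIPT_PREFIXES` and the helpers `script_of`, `script_composition`, and `is_mixed` are hypothetical names, and script labels are inferred coarsely from Unicode character names rather than from a proper script database.

```python
import unicodedata
from collections import Counter

# Assumption: the first word of a character's Unicode name is a good-enough
# proxy for its script. This table is illustrative, not exhaustive.
SCRIPT_PREFIXES = {
    "LATIN": "Latin",
    "CJK": "Han",
    "CYRILLIC": "Cyrillic",
    "ARABIC": "Arabic",
    "DEVANAGARI": "Devanagari",
    "HANGUL": "Hangul",
    "HIRAGANA": "Hiragana",
    "KATAKANA": "Katakana",
}


def script_of(ch):
    """Return a coarse script label for one character, or None for non-letters."""
    if not ch.isalpha():
        return None  # digits, punctuation, whitespace, and marks are ignored
    name = unicodedata.name(ch, "")
    prefix = name.split(" ", 1)[0]
    return SCRIPT_PREFIXES.get(prefix, "Other")


def script_composition(text):
    """Map each script found in `text` to its fraction of alphabetic characters."""
    counts = Counter(s for s in map(script_of, text) if s is not None)
    total = sum(counts.values())
    return {script: n / total for script, n in counts.items()} if total else {}


def is_mixed(trace, prompt_script):
    """True if the trace contains letters from any script other than the prompt's."""
    return any(script != prompt_script for script in script_composition(trace))


if __name__ == "__main__":
    trace = "Let x = 3. 因此 x + 2 = 5."
    print(script_composition(trace))
    print(is_mixed(trace, "Latin"))
```

Script is only a proxy for language: many languages share the Latin script, so a check like this can flag Latin versus Han mixing but cannot separate, say, English from German.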

Anthology ID: 2025.emnlp-main.132
Volume: Proceedings of the 2025 Conference on Empirical Methods in Natural Language Processing
Month: November
Year: 2025
Address: Suzhou, China
Editors: Christos Christodoulopoulos, Tanmoy Chakraborty, Carolyn Rose, Violet Peng
Venue: EMNLP
Publisher: Association for Computational Linguistics
Pages: 2637–2665
URL: https://aclanthology.org/2025.emnlp-main.132/
DOI: 10.18653/v1/2025.emnlp-main.132
Cite (ACL): Mingyang Wang, Lukas Lange, Heike Adel, Yunpu Ma, Jannik Strötgen, and Hinrich Schuetze. 2025. Language Mixing in Reasoning Language Models: Patterns, Impact, and Internal Causes. In Proceedings of the 2025 Conference on Empirical Methods in Natural Language Processing, pages 2637–2665, Suzhou, China. Association for Computational Linguistics.
Cite (Informal): Language Mixing in Reasoning Language Models: Patterns, Impact, and Internal Causes (Wang et al., EMNLP 2025)
PDF: https://aclanthology.org/2025.emnlp-main.132.pdf
Checklist: 2025.emnlp-main.132.checklist.pdf
