ThinkAnswer Loss: Balancing Semantic Similarity and Exact Matching for LLM Reasoning Enhancement

Shan Yang, Kun Wu, Zeju Li, Linlin Zhang, Xiangyu Pei, Leike An, Yu Liu


Abstract
Knowledge distillation for large language models often uses Chain-of-Thought (CoT) and answer pairs, but existing methods struggle to provide appropriate supervision signals for each component. Uniform constraints (e.g., cross-entropy) on the CoT can enforce literal, verbose reasoning and suppress expressive diversity, while purely semantic constraints on answers can reduce accuracy in classification tasks. This paper proposes ThinkAnswer Loss, an information-theoretic differential supervision framework that decouples CoT and answer supervision: it applies semantic similarity constraints to the CoT portion while maintaining strict literal matching for the answer. We theoretically demonstrate its connection to mutual information maximization and derive a tight upper bound on generalization error. Experimental validation on text quality assessment and mathematical reasoning tasks shows that our method maintains answer accuracy while effectively reducing CoT length and preserving semantic content, thereby accelerating inference.
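The decoupled supervision described in the abstract can be sketched as a weighted sum of a semantic-similarity term over the CoT and a token-level cross-entropy term over the answer. The sketch below is a minimal, framework-free illustration of that idea; the function names, pooled-embedding inputs, and the mixing weight `alpha` are illustrative assumptions, not the paper's actual formulation.

```python
import math

def cosine_similarity(u, v):
    # Cosine similarity between two dense vectors (e.g., pooled CoT embeddings).
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)

def token_cross_entropy(probs, target_idx):
    # Negative log-likelihood of the target token under the predicted distribution.
    return -math.log(probs[target_idx])

def think_answer_loss(cot_emb_student, cot_emb_teacher,
                      answer_probs, answer_targets, alpha=0.5):
    # Semantic term on the CoT: 1 - cosine similarity of pooled embeddings,
    # so paraphrased but semantically faithful reasoning is not penalized.
    semantic_loss = 1.0 - cosine_similarity(cot_emb_student, cot_emb_teacher)
    # Exact-match term on the answer: mean token-level cross-entropy,
    # keeping strict literal supervision where accuracy matters.
    ce_loss = sum(token_cross_entropy(p, t)
                  for p, t in zip(answer_probs, answer_targets)) / len(answer_targets)
    # alpha balances the two terms (an assumed hyperparameter, not from the paper).
    return alpha * semantic_loss + (1.0 - alpha) * ce_loss
```

With identical CoT embeddings and a confident correct answer, both terms vanish; the loss grows as the student's reasoning drifts semantically or its answer tokens diverge literally.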
Anthology ID:
2025.findings-emnlp.177
Volume:
Findings of the Association for Computational Linguistics: EMNLP 2025
Month:
November
Year:
2025
Address:
Suzhou, China
Editors:
Christos Christodoulopoulos, Tanmoy Chakraborty, Carolyn Rose, Violet Peng
Venue:
Findings
Publisher:
Association for Computational Linguistics
Pages:
3325–3347
URL:
https://aclanthology.org/2025.findings-emnlp.177/
Cite (ACL):
Shan Yang, Kun Wu, Zeju Li, Linlin Zhang, Xiangyu Pei, Leike An, and Yu Liu. 2025. ThinkAnswer Loss: Balancing Semantic Similarity and Exact Matching for LLM Reasoning Enhancement. In Findings of the Association for Computational Linguistics: EMNLP 2025, pages 3325–3347, Suzhou, China. Association for Computational Linguistics.
Cite (Informal):
ThinkAnswer Loss: Balancing Semantic Similarity and Exact Matching for LLM Reasoning Enhancement (Yang et al., Findings 2025)
PDF:
https://aclanthology.org/2025.findings-emnlp.177.pdf
Checklist:
 2025.findings-emnlp.177.checklist.pdf