ODASim: Ordered, Distinctive and Absolute Semantic Similarity for Code Explanation Evaluation

Prince Kumar, Vitobha Munigala, Jaydeep Sen, Ashish Mittal, Vishwajeet Kumar, Srikanth G. Tamilselvam


Abstract
Code explanations are increasingly generated by large language models and used in software engineering workflows, making reliable evaluation essential. However, existing model-based and embedding-based methods often fail to distinguish correct explanations from partially or fully incorrect ones, and their similarity scores are poorly calibrated and do not reflect meaningful differences in explanation quality. To address this, we propose ODASim (Ordered, Distinctive, and Absolute Semantic Similarity), a model-agnostic graded fine-tuning framework for embedding models that learns calibrated similarity representations between code and explanations. To support fine-grained supervision and evaluation, we also introduce ODA-X, a novel benchmark for code-to-explanation quality grading, comprising code–explanation pairs with graded similarity labels derived from strategic perturbations of gold explanations. We apply our ODASim approach to multiple embedding models and evaluate it on two benchmarks, the widely used CodeXGLUE and our proposed ODA-X, spanning four programming languages: Python, Java, JavaScript, and Go. Results show that our method achieves up to a 35% improvement in F1 score and an 85% reduction in Expected Calibration Error (ECE), enabling reliable evaluation of code-to-explanation quality.
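For readers unfamiliar with the calibration metric named in the abstract, the following is a minimal sketch of the standard equal-width-bin formulation of Expected Calibration Error. It is not taken from the paper: the function name `expected_calibration_error`, the choice of 10 bins, and the assumption that each similarity score lies in [0, 1] and is read as the predicted probability that an explanation is correct are all illustrative assumptions, and the authors' exact protocol may differ.

```python
from typing import Sequence


def expected_calibration_error(
    scores: Sequence[float],
    labels: Sequence[int],
    n_bins: int = 10,
) -> float:
    """Standard binned ECE (hypothetical helper, not the paper's code).

    scores: similarity scores in [0, 1], assumed to act as the predicted
            probability that the explanation is correct.
    labels: 1 if the explanation is actually correct, else 0.
    """
    if len(scores) != len(labels):
        raise ValueError("scores and labels must have the same length")
    if len(scores) == 0:
        raise ValueError("at least one sample is required")

    # Group samples into equal-width bins over [0, 1].
    bins = [[] for _ in range(n_bins)]
    for score, label in zip(scores, labels):
        index = min(int(score * n_bins), n_bins - 1)  # score == 1.0 goes in the last bin
        bins[index].append((score, label))

    total = len(scores)
    ece = 0.0
    for bucket in bins:
        if not bucket:
            continue
        mean_score = sum(s for s, _ in bucket) / len(bucket)    # average predicted score
        frac_correct = sum(y for _, y in bucket) / len(bucket)  # observed accuracy
        # Weight each bin's calibration gap by its share of the samples.
        ece += (len(bucket) / total) * abs(frac_correct - mean_score)
    return ece


if __name__ == "__main__":
    # Toy data, invented for illustration only.
    toy_scores = [0.9, 0.9, 0.1, 0.1]
    toy_labels = [1, 1, 0, 0]
    print(expected_calibration_error(toy_scores, toy_labels))
```

Under this formulation a lower value means the scores track observed correctness more closely, which is the sense in which a reduction in ECE indicates better-calibrated similarity scores.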
Anthology ID:
2026.findings-acl.1415
Volume:
Findings of the Association for Computational Linguistics: ACL 2026
Month:
July
Year:
2026
Address:
San Diego, California, United States
Editors:
Maria Liakata, Viviane P. Moreira, Jiajun Zhang, David Jurgens
Venue:
Findings
Publisher:
Association for Computational Linguistics
Pages:
28390–28403
URL:
https://aclanthology.org/2026.findings-acl.1415/
Cite (ACL):
Prince Kumar, Vitobha Munigala, Jaydeep Sen, Ashish Mittal, Vishwajeet Kumar, and Srikanth G. Tamilselvam. 2026. ODASim: Ordered, Distinctive and Absolute Semantic Similarity for Code Explanation Evaluation. In Findings of the Association for Computational Linguistics: ACL 2026, pages 28390–28403, San Diego, California, United States. Association for Computational Linguistics.
Cite (Informal):
ODASim: Ordered, Distinctive and Absolute Semantic Similarity for Code Explanation Evaluation (Kumar et al., Findings 2026)
PDF:
https://aclanthology.org/2026.findings-acl.1415.pdf
Checklist:
 2026.findings-acl.1415.checklist.pdf