RV-Syn: Rational and Verifiable Mathematical Reasoning Data Synthesis based on Structured Function Library

Jiapeng Wang; Jinhao Jiang; Zhiqiang Zhang; Jun Zhou; Wayne Xin Zhao

Abstract

Advancing the reasoning capabilities of Large Language Models (LLMs) requires substantial amounts of high-quality reasoning data, particularly in mathematics. Existing data synthesis methods, such as data augmentation from annotated training sets or direct question generation based on relevant knowledge points and documents, have expanded datasets, but they struggle to capture the internal logic of a problem during generation and to ensure that its solution is verifiable. To address these issues, we propose RV-Syn, a novel Rational and Verifiable mathematical Synthesis approach. RV-Syn first constructs a structured library of mathematical operations and then composes them into executable computational graphs, which serve as verifiable solution blueprints. These graphs are subsequently back-translated into complex problems, enabling solution-guided, logic-aware problem generation while inherently ensuring the verifiability of the solving process. Experimental results show that RV-Syn surpasses existing synthesis methods, including those involving human-crafted problems. Our method achieves a 6.3% performance gain over the previous state-of-the-art synthetic data on LLaMA-3-8B and demonstrates superior data efficiency, outperforming other methods with only half the training data (50k vs. 100k). This enables a more scalable and robust framework for generating reasoning datasets.
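
The pipeline described in the abstract (a library of operations, composed into an executable graph whose result can be checked) can be illustrated with a minimal Python sketch. This is not the authors' implementation: the names `LIBRARY`, `Node`, `execute`, and `describe`, the three toy operations, and the example inputs are all hypothetical, chosen only to show how a computational graph can act as a verifiable solution blueprint.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List, Tuple

# Hypothetical function library: operation name -> callable.
# The real RV-Syn library is not specified here; these are toy entries.
LIBRARY: Dict[str, Callable[..., float]] = {
    "add": lambda a, b: a + b,
    "multiply": lambda a, b: a * b,
    "rectangle_area": lambda w, h: w * h,
}


@dataclass
class Node:
    name: str               # name of the value this node produces
    func: str               # key into LIBRARY
    args: Tuple[str, ...]   # names of problem inputs or earlier nodes


def execute(graph: List[Node], inputs: Dict[str, float]) -> Dict[str, float]:
    """Run the nodes in order; every intermediate value is recorded."""
    values = dict(inputs)
    for node in graph:
        values[node.name] = LIBRARY[node.func](*(values[a] for a in node.args))
    return values


def describe(graph: List[Node]) -> str:
    """Render the graph as text, e.g. as material for a back-translation prompt."""
    return "\n".join(f"{n.name} = {n.func}({', '.join(n.args)})" for n in graph)


# A three-step graph (assumed example): area, then cost, then total.
graph = [
    Node("area", "rectangle_area", ("width", "height")),
    Node("cost", "multiply", ("area", "price_per_unit")),
    Node("total", "add", ("cost", "delivery_fee")),
]
inputs = {"width": 4, "height": 5, "price_per_unit": 3, "delivery_fee": 10}

values = execute(graph, inputs)
print(describe(graph))
print("total =", values["total"])
```

In this sketch the final answer comes from executing the graph rather than from free-form generation, which is the sense in which the solution is verifiable. The back-translation step, turning the rendered graph into a natural-language problem, would be done by an LLM and is left out.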

Anthology ID: 2026.findings-eacl.93
Volume: Findings of the Association for Computational Linguistics: EACL 2026
Month: March
Year: 2026
Address: Rabat, Morocco
Editors: Vera Demberg, Kentaro Inui, Lluís Marquez
Venue: Findings
Publisher: Association for Computational Linguistics
Pages: 1812–1827
URL: https://aclanthology.org/2026.findings-eacl.93/
Cite (ACL): Jiapeng Wang, Jinhao Jiang, Zhiqiang Zhang, Jun Zhou, and Xin Zhao. 2026. RV-Syn: Rational and Verifiable Mathematical Reasoning Data Synthesis based on Structured Function Library. In Findings of the Association for Computational Linguistics: EACL 2026, pages 1812–1827, Rabat, Morocco. Association for Computational Linguistics.
Cite (Informal): RV-Syn: Rational and Verifiable Mathematical Reasoning Data Synthesis based on Structured Function Library (Wang et al., Findings 2026)
PDF: https://aclanthology.org/2026.findings-eacl.93.pdf
Checklist: 2026.findings-eacl.93.checklist.pdf
