Exploring Mathematical Extrapolation of Large Language Models with Synthetic Data

Haolong Li, Yu Ma, Yinqi Zhang, Chen Ye, Jie Chen


Abstract
While large language models (LLMs) have shown excellent capabilities in language understanding, text generation, and many other tasks, they still struggle with complex multi-step reasoning problems such as mathematical reasoning. In this paper, through a newly proposed arithmetical puzzle problem, we show that the model can perform well on multi-step reasoning tasks via fine-tuning on high-quality synthetic data. Experiments with the open-llama-3B model on three different test datasets show that not only can the model reach a zero-shot pass@1 of 0.44 on the in-domain dataset, but it also demonstrates certain generalization capabilities on the out-of-domain datasets. Specifically, this paper designs two out-of-domain datasets by separately extending the numerical range and the composing components of the arithmetical puzzle problem. The fine-tuned model shows encouraging performance on these two far more difficult tasks, with zero-shot pass@1 of 0.33 and 0.35, respectively.
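As a rough illustration of the metric reported above, the following is a minimal sketch of how a pass@1 score can be computed. It assumes one generated answer per problem and a boolean correctness flag for each. The function name `pass_at_1` and the outcome list are hypothetical and are not taken from the paper, whose exact evaluation procedure is not described here.

```python
from typing import Sequence


def pass_at_1(is_correct: Sequence[bool]) -> float:
    """Return the fraction of problems whose single generated answer is correct."""
    if not is_correct:
        raise ValueError("need at least one problem")
    return sum(is_correct) / len(is_correct)


# Hypothetical outcomes: 100 test problems, 44 solved on the first attempt.
outcomes = [True] * 44 + [False] * 56
print(pass_at_1(outcomes))
```

Under this reading, a pass@1 of 0.44 would mean that 44% of the test problems are solved by the model's first and only attempt.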
Anthology ID:
2024.findings-acl.55
Volume:
Findings of the Association for Computational Linguistics ACL 2024
Month:
August
Year:
2024
Address:
Bangkok, Thailand and virtual meeting
Editors:
Lun-Wei Ku, Andre Martins, Vivek Srikumar
Venue:
Findings
Publisher:
Association for Computational Linguistics
Pages:
936–946
URL:
https://aclanthology.org/2024.findings-acl.55
Cite (ACL):
Haolong Li, Yu Ma, Yinqi Zhang, Chen Ye, and Jie Chen. 2024. Exploring Mathematical Extrapolation of Large Language Models with Synthetic Data. In Findings of the Association for Computational Linguistics ACL 2024, pages 936–946, Bangkok, Thailand and virtual meeting. Association for Computational Linguistics.
Cite (Informal):
Exploring Mathematical Extrapolation of Large Language Models with Synthetic Data (Li et al., Findings 2024)
PDF:
https://aclanthology.org/2024.findings-acl.55.pdf