@inproceedings{arai-etal-2025-evaluating,
  title     = {Evaluating {LLMs}' Ability to Understand Numerical Time Series for Text Generation},
  author    = {Arai, Mizuki and
               Ishigaki, Tatsuya and
               Kawarada, Masayuki and
               Miyao, Yusuke and
               Takamura, Hiroya and
               Kobayashi, Ichiro},
  editor    = {Flek, Lucie and
               Narayan, Shashi and
               Phương, Lê Hồng and
               Pei, Jiahuan},
  booktitle = {Proceedings of the 18th International Natural Language Generation Conference},
  month     = oct,
  year      = {2025},
  address   = {Hanoi, Vietnam},
  publisher = {Association for Computational Linguistics},
  url       = {https://aclanthology.org/2025.inlg-main.16/},
  pages     = {232--248},
  abstract  = {Data-to-text generation tasks often involve processing numerical time-series as input such as financial statistics or meteorological data. Although large language models (LLMs) are a powerful approach to data-to-text, we still lack a comprehensive understanding of how well they actually understand time-series data. We therefore introduce a benchmark with 18 evaluation tasks to assess LLMs' abilities of interpreting numerical time-series, which are categorized into: 1) event detection{---}identifying maxima and minima; 2) computation{---}averaging and summation; 3) pairwise comparison{---}comparing values over time; and 4) inference{---}imputation and forecasting. Our experiments reveal five key findings: 1) even state-of-the-art LLMs struggle with complex multi-step reasoning; 2) tasks that require extracting values or performing computations within a specified range of the time-series significantly reduce accuracy; 3) instruction tuning offers inconsistent improvements for numerical interpretation; 4) reasoning-based models outperform standard LLMs in complex numerical tasks; and 5) LLMs perform interpolation better than forecasting. These results establish a clear baseline and serve as a wake-up call for anyone aiming to blend fluent language with trustworthy numeric precision in time-series scenarios.},
}
<?xml version="1.0" encoding="UTF-8"?>
<modsCollection xmlns="http://www.loc.gov/mods/v3">
<mods ID="arai-etal-2025-evaluating">
<titleInfo>
<title>Evaluating LLMs’ Ability to Understand Numerical Time Series for Text Generation</title>
</titleInfo>
<name type="personal">
<namePart type="given">Mizuki</namePart>
<namePart type="family">Arai</namePart>
<role>
<roleTerm authority="marcrelator" type="text">author</roleTerm>
</role>
</name>
<name type="personal">
<namePart type="given">Tatsuya</namePart>
<namePart type="family">Ishigaki</namePart>
<role>
<roleTerm authority="marcrelator" type="text">author</roleTerm>
</role>
</name>
<name type="personal">
<namePart type="given">Masayuki</namePart>
<namePart type="family">Kawarada</namePart>
<role>
<roleTerm authority="marcrelator" type="text">author</roleTerm>
</role>
</name>
<name type="personal">
<namePart type="given">Yusuke</namePart>
<namePart type="family">Miyao</namePart>
<role>
<roleTerm authority="marcrelator" type="text">author</roleTerm>
</role>
</name>
<name type="personal">
<namePart type="given">Hiroya</namePart>
<namePart type="family">Takamura</namePart>
<role>
<roleTerm authority="marcrelator" type="text">author</roleTerm>
</role>
</name>
<name type="personal">
<namePart type="given">Ichiro</namePart>
<namePart type="family">Kobayashi</namePart>
<role>
<roleTerm authority="marcrelator" type="text">author</roleTerm>
</role>
</name>
<originInfo>
<dateIssued>2025-10</dateIssued>
</originInfo>
<typeOfResource>text</typeOfResource>
<relatedItem type="host">
<titleInfo>
<title>Proceedings of the 18th International Natural Language Generation Conference</title>
</titleInfo>
<name type="personal">
<namePart type="given">Lucie</namePart>
<namePart type="family">Flek</namePart>
<role>
<roleTerm authority="marcrelator" type="text">editor</roleTerm>
</role>
</name>
<name type="personal">
<namePart type="given">Shashi</namePart>
<namePart type="family">Narayan</namePart>
<role>
<roleTerm authority="marcrelator" type="text">editor</roleTerm>
</role>
</name>
<name type="personal">
<namePart type="given">Lê</namePart>
<namePart type="given">Hồng</namePart>
<namePart type="family">Phương</namePart>
<role>
<roleTerm authority="marcrelator" type="text">editor</roleTerm>
</role>
</name>
<name type="personal">
<namePart type="given">Jiahuan</namePart>
<namePart type="family">Pei</namePart>
<role>
<roleTerm authority="marcrelator" type="text">editor</roleTerm>
</role>
</name>
<originInfo>
<publisher>Association for Computational Linguistics</publisher>
<place>
<placeTerm type="text">Hanoi, Vietnam</placeTerm>
</place>
</originInfo>
<genre authority="marcgt">conference publication</genre>
</relatedItem>
<abstract>Data-to-text generation tasks often involve processing numerical time-series as input such as financial statistics or meteorological data. Although large language models (LLMs) are a powerful approach to data-to-text, we still lack a comprehensive understanding of how well they actually understand time-series data. We therefore introduce a benchmark with 18 evaluation tasks to assess LLMs’ abilities of interpreting numerical time-series, which are categorized into: 1) event detection—identifying maxima and minima; 2) computation—averaging and summation; 3) pairwise comparison—comparing values over time; and 4) inference—imputation and forecasting. Our experiments reveal five key findings: 1) even state-of-the-art LLMs struggle with complex multi-step reasoning; 2) tasks that require extracting values or performing computations within a specified range of the time-series significantly reduce accuracy; 3) instruction tuning offers inconsistent improvements for numerical interpretation; 4) reasoning-based models outperform standard LLMs in complex numerical tasks; and 5) LLMs perform interpolation better than forecasting. These results establish a clear baseline and serve as a wake-up call for anyone aiming to blend fluent language with trustworthy numeric precision in time-series scenarios.</abstract>
<identifier type="citekey">arai-etal-2025-evaluating</identifier>
<location>
<url>https://aclanthology.org/2025.inlg-main.16/</url>
</location>
<part>
<date>2025-10</date>
<extent unit="page">
<start>232</start>
<end>248</end>
</extent>
</part>
</mods>
</modsCollection>
%0 Conference Proceedings
%T Evaluating LLMs’ Ability to Understand Numerical Time Series for Text Generation
%A Arai, Mizuki
%A Ishigaki, Tatsuya
%A Kawarada, Masayuki
%A Miyao, Yusuke
%A Takamura, Hiroya
%A Kobayashi, Ichiro
%Y Flek, Lucie
%Y Narayan, Shashi
%Y Phương, Lê Hồng
%Y Pei, Jiahuan
%S Proceedings of the 18th International Natural Language Generation Conference
%D 2025
%8 October
%I Association for Computational Linguistics
%C Hanoi, Vietnam
%F arai-etal-2025-evaluating
%X Data-to-text generation tasks often involve processing numerical time-series as input such as financial statistics or meteorological data. Although large language models (LLMs) are a powerful approach to data-to-text, we still lack a comprehensive understanding of how well they actually understand time-series data. We therefore introduce a benchmark with 18 evaluation tasks to assess LLMs’ abilities of interpreting numerical time-series, which are categorized into: 1) event detection—identifying maxima and minima; 2) computation—averaging and summation; 3) pairwise comparison—comparing values over time; and 4) inference—imputation and forecasting. Our experiments reveal five key findings: 1) even state-of-the-art LLMs struggle with complex multi-step reasoning; 2) tasks that require extracting values or performing computations within a specified range of the time-series significantly reduce accuracy; 3) instruction tuning offers inconsistent improvements for numerical interpretation; 4) reasoning-based models outperform standard LLMs in complex numerical tasks; and 5) LLMs perform interpolation better than forecasting. These results establish a clear baseline and serve as a wake-up call for anyone aiming to blend fluent language with trustworthy numeric precision in time-series scenarios.
%U https://aclanthology.org/2025.inlg-main.16/
%P 232-248
Markdown (Informal)
[Evaluating LLMs’ Ability to Understand Numerical Time Series for Text Generation](https://aclanthology.org/2025.inlg-main.16/) (Arai et al., INLG 2025)
ACL