FinDABench: Benchmarking Financial Data Analysis Ability of Large Language Models

Shu Liu | Shangqing Zhao | Chenghao Jia | Xinlin Zhuang | Zhaoguang Long | Jie Zhou | Aimin Zhou | Man Lan | Yang Chong

Proceedings of the 31st International Conference on Computational Linguistics, 2025
Large Language Models (LLMs) have demonstrated impressive capabilities across a wide range of tasks. However, their proficiency and reliability in the specialized domain of financial data analysis, particularly with respect to data-driven thinking, remain uncertain. To bridge this gap, we introduce FinDABench, a comprehensive benchmark designed to evaluate the financial data analysis capabilities of LLMs in this context. The benchmark comprises 15,200 training instances and 8,900 test instances, all meticulously crafted by human experts. FinDABench assesses LLMs across three dimensions: 1) Core Ability, evaluating the models’ ability to perform financial indicator calculation and corporate sentiment risk assessment; 2) Analytical Ability, determining the models’ ability to quickly comprehend textual information and analyze abnormal financial reports; and 3) Technical Ability, examining the models’ use of technical knowledge to address real-world data analysis challenges involving analysis generation and chart visualization from multiple perspectives. We will release FinDABench and the evaluation scripts at https://github.com/xxx. FinDABench aims to provide a measure for in-depth analysis of LLM abilities and to foster the advancement of LLMs in the field of financial data analysis.