Evaluating Financial Literacy of Large Language Models through Domain Specific Languages for Plain Text Accounting

Alexei Gustavo Figueroa Rosero, Paul Grundmann, Julius Freidank, Wolfgang Nejdl, Alexander Loeser


Abstract
Large language models (LLMs) have proven highly effective for a wide range of tasks, including code generation. Recently, advancements in their capabilities have shown promise in areas like mathematical reasoning, chain-of-thought processes and self-reflection. However, their effectiveness in domains requiring nuanced understanding of financial contexts, such as accounting, remains unclear. In this study, we evaluate how well LLMs perform in generating code for domain-specific languages (DSLs) in accounting, using Beancount as a case study. We create a set of tasks based on common financial ratios to evaluate the numeracy and financial literacy of LLMs. Our findings reveal that while LLMs are state-of-the-art in generative tasks, they struggle severely with accounting, often producing inaccurate calculations and misinterpreting financial scenarios. We characterize these shortcomings through a comprehensive evaluation, shedding light on the limitations of LLMs in understanding and handling money-related tasks.
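To make the idea of a ratio-based task concrete, the following is a minimal sketch of computing one common financial ratio, the current ratio, over double-entry postings. It is not taken from the paper: the account names, the amounts, and the helpers `balance` and `current_ratio` are hypothetical, and the sketch uses plain Python rather than the Beancount library. It assumes Beancount-style colon-separated account names and signed amounts, where liabilities and equity carry negative balances.

```python
from decimal import Decimal

# Hypothetical postings (account, signed amount); illustrative data only.
# Assumed convention: assets are positive, liabilities and equity negative.
postings = [
    ("Assets:Current:Cash", Decimal("1500.00")),
    ("Assets:Current:Receivables", Decimal("500.00")),
    ("Liabilities:Current:CreditCard", Decimal("-800.00")),
    ("Liabilities:Current:Payables", Decimal("-200.00")),
    ("Equity:Opening-Balances", Decimal("-1000.00")),
]


def balance(entries, prefix):
    """Sum the signed amounts of all accounts whose name starts with prefix."""
    return sum(
        (amount for account, amount in entries if account.startswith(prefix)),
        Decimal("0"),
    )


def current_ratio(entries):
    """Current assets divided by current liabilities (as a positive number)."""
    assets = balance(entries, "Assets:Current")
    # Flip the sign because liabilities are stored as negative amounts.
    liabilities = -balance(entries, "Liabilities:Current")
    if liabilities == 0:
        raise ValueError("current ratio is undefined without current liabilities")
    return assets / liabilities
```

With these illustrative postings, `current_ratio(postings)` divides 2000.00 of current assets by 1000.00 of current liabilities. `Decimal` is used instead of `float` so that monetary amounts are not subject to binary rounding error.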
Anthology ID:
2025.finnlp-1.6
Volume:
Proceedings of the Joint Workshop of the 9th Financial Technology and Natural Language Processing (FinNLP), the 6th Financial Narrative Processing (FNP), and the 1st Workshop on Large Language Models for Finance and Legal (LLMFinLegal)
Month:
January
Year:
2025
Address:
Abu Dhabi, UAE
Editors:
Chung-Chi Chen, Antonio Moreno-Sandoval, Jimin Huang, Qianqian Xie, Sophia Ananiadou, Hsin-Hsi Chen
Venues:
FinNLP | WS
Publisher:
Association for Computational Linguistics
Pages:
63–75
URL:
https://aclanthology.org/2025.finnlp-1.6/
Cite (ACL):
Alexei Gustavo Figueroa Rosero, Paul Grundmann, Julius Freidank, Wolfgang Nejdl, and Alexander Loeser. 2025. Evaluating Financial Literacy of Large Language Models through Domain Specific Languages for Plain Text Accounting. In Proceedings of the Joint Workshop of the 9th Financial Technology and Natural Language Processing (FinNLP), the 6th Financial Narrative Processing (FNP), and the 1st Workshop on Large Language Models for Finance and Legal (LLMFinLegal), pages 63–75, Abu Dhabi, UAE. Association for Computational Linguistics.
Cite (Informal):
Evaluating Financial Literacy of Large Language Models through Domain Specific Languages for Plain Text Accounting (Figueroa Rosero et al., FinNLP 2025)
PDF:
https://aclanthology.org/2025.finnlp-1.6.pdf