Forecasting Credit Ratings: A Case Study where Traditional Methods Outperform Generative LLMs

Felix Drinkall, Janet B. Pierrehumbert, Stefan Zohren


Abstract
Large Language Models (LLMs) have been shown to perform well for many downstream tasks. Transfer learning can enable LLMs to acquire skills that were not targeted during pre-training. In financial contexts, LLMs can sometimes beat well-established benchmarks. This paper investigates how well LLMs perform at forecasting corporate credit ratings. We show that while LLMs are very good at encoding textual information, traditional methods are still very competitive when it comes to encoding numeric and multimodal data. For our task, current LLMs perform worse than a more traditional XGBoost architecture that combines fundamental and macroeconomic data with high-density text-based embedding features. We investigate the degree to which the text encoding methodology affects performance and interpretability.
Anthology ID:
2025.finnlp-1.11
Volume:
Proceedings of the Joint Workshop of the 9th Financial Technology and Natural Language Processing (FinNLP), the 6th Financial Narrative Processing (FNP), and the 1st Workshop on Large Language Models for Finance and Legal (LLMFinLegal)
Month:
January
Year:
2025
Address:
Abu Dhabi, UAE
Editors:
Chung-Chi Chen, Antonio Moreno-Sandoval, Jimin Huang, Qianqian Xie, Sophia Ananiadou, Hsin-Hsi Chen
Venues:
FinNLP | WS
Publisher:
Association for Computational Linguistics
Pages:
118–133
URL:
https://aclanthology.org/2025.finnlp-1.11/
Cite (ACL):
Felix Drinkall, Janet B. Pierrehumbert, and Stefan Zohren. 2025. Forecasting Credit Ratings: A Case Study where Traditional Methods Outperform Generative LLMs. In Proceedings of the Joint Workshop of the 9th Financial Technology and Natural Language Processing (FinNLP), the 6th Financial Narrative Processing (FNP), and the 1st Workshop on Large Language Models for Finance and Legal (LLMFinLegal), pages 118–133, Abu Dhabi, UAE. Association for Computational Linguistics.
Cite (Informal):
Forecasting Credit Ratings: A Case Study where Traditional Methods Outperform Generative LLMs (Drinkall et al., FinNLP 2025)
PDF:
https://aclanthology.org/2025.finnlp-1.11.pdf