@inproceedings{al-laith-2025-exploring,
title = "Exploring the Effectiveness of Multilingual and Generative Large Language Models for Question Answering in Financial Texts",
author = "Al-Laith, Ali",
editor = "Chen, Chung-Chi and
Moreno-Sandoval, Antonio and
Huang, Jimin and
Xie, Qianqian and
Ananiadou, Sophia and
Chen, Hsin-Hsi",
booktitle = "Proceedings of the Joint Workshop of the 9th Financial Technology and Natural Language Processing (FinNLP), the 6th Financial Narrative Processing (FNP), and the 1st Workshop on Large Language Models for Finance and Legal (LLMFinLegal)",
month = jan,
year = "2025",
address = "Abu Dhabi, UAE",
publisher = "Association for Computational Linguistics",
url = "https://aclanthology.org/2025.finnlp-1.23/",
pages = "230--235",
abstract = "This paper investigates the use of large language models (LLMs) for financial causality detection in the FinCausal 2025 shared task, focusing on generative and multilingual question answering (QA) tasks. Our study employed both generative and discriminative approaches, utilizing GPT-4o for generative QA and BERT-base-multilingual-cased, XLM-RoBerta-large, and XLM-RoBerta-base for multilingual QA across English and Spanish datasets. The datasets consist of financial disclosures where questions reflect causal relationships, paired with extractive answers derived directly from the text. Evaluation was conducted using Semantic Answer Similarity (SAS) and Exact Match (EM) metrics. While the discriminative XLM-RoBerta-large model achieved the best overall performance, ranking 5th in English (SAS: 0.9598, EM: 0.7615) and 4th in Spanish (SAS: 0.9756, EM: 0.8084) among 11 team submissions, our results also highlight the effectiveness of the generative GPT-4o approach. Notably, GPT-4o achieved promising results in few-shot settings, with SAS scores approaching those of fine-tuned discriminative models, demonstrating that the generative approach can provide competitive performance despite lacking task-specific fine-tuning. This comparison underscores the potential of generative LLMs as robust, versatile alternatives for complex QA tasks like financial causality detection."
}
<?xml version="1.0" encoding="UTF-8"?>
<modsCollection xmlns="http://www.loc.gov/mods/v3">
<mods ID="al-laith-2025-exploring">
<titleInfo>
<title>Exploring the Effectiveness of Multilingual and Generative Large Language Models for Question Answering in Financial Texts</title>
</titleInfo>
<name type="personal">
<namePart type="given">Ali</namePart>
<namePart type="family">Al-Laith</namePart>
<role>
<roleTerm authority="marcrelator" type="text">author</roleTerm>
</role>
</name>
<originInfo>
<dateIssued>2025-01</dateIssued>
</originInfo>
<typeOfResource>text</typeOfResource>
<relatedItem type="host">
<titleInfo>
<title>Proceedings of the Joint Workshop of the 9th Financial Technology and Natural Language Processing (FinNLP), the 6th Financial Narrative Processing (FNP), and the 1st Workshop on Large Language Models for Finance and Legal (LLMFinLegal)</title>
</titleInfo>
<name type="personal">
<namePart type="given">Chung-Chi</namePart>
<namePart type="family">Chen</namePart>
<role>
<roleTerm authority="marcrelator" type="text">editor</roleTerm>
</role>
</name>
<name type="personal">
<namePart type="given">Antonio</namePart>
<namePart type="family">Moreno-Sandoval</namePart>
<role>
<roleTerm authority="marcrelator" type="text">editor</roleTerm>
</role>
</name>
<name type="personal">
<namePart type="given">Jimin</namePart>
<namePart type="family">Huang</namePart>
<role>
<roleTerm authority="marcrelator" type="text">editor</roleTerm>
</role>
</name>
<name type="personal">
<namePart type="given">Qianqian</namePart>
<namePart type="family">Xie</namePart>
<role>
<roleTerm authority="marcrelator" type="text">editor</roleTerm>
</role>
</name>
<name type="personal">
<namePart type="given">Sophia</namePart>
<namePart type="family">Ananiadou</namePart>
<role>
<roleTerm authority="marcrelator" type="text">editor</roleTerm>
</role>
</name>
<name type="personal">
<namePart type="given">Hsin-Hsi</namePart>
<namePart type="family">Chen</namePart>
<role>
<roleTerm authority="marcrelator" type="text">editor</roleTerm>
</role>
</name>
<originInfo>
<publisher>Association for Computational Linguistics</publisher>
<place>
<placeTerm type="text">Abu Dhabi, UAE</placeTerm>
</place>
</originInfo>
<genre authority="marcgt">conference publication</genre>
</relatedItem>
<abstract>This paper investigates the use of large language models (LLMs) for financial causality detection in the FinCausal 2025 shared task, focusing on generative and multilingual question answering (QA) tasks. Our study employed both generative and discriminative approaches, utilizing GPT-4o for generative QA and BERT-base-multilingual-cased, XLM-RoBERTa-large, and XLM-RoBERTa-base for multilingual QA across English and Spanish datasets. The datasets consist of financial disclosures where questions reflect causal relationships, paired with extractive answers derived directly from the text. Evaluation was conducted using Semantic Answer Similarity (SAS) and Exact Match (EM) metrics. While the discriminative XLM-RoBERTa-large model achieved the best overall performance, ranking 5th in English (SAS: 0.9598, EM: 0.7615) and 4th in Spanish (SAS: 0.9756, EM: 0.8084) among 11 team submissions, our results also highlight the effectiveness of the generative GPT-4o approach. Notably, GPT-4o achieved promising results in few-shot settings, with SAS scores approaching those of fine-tuned discriminative models, demonstrating that the generative approach can provide competitive performance despite lacking task-specific fine-tuning. This comparison underscores the potential of generative LLMs as robust, versatile alternatives for complex QA tasks like financial causality detection.</abstract>
<identifier type="citekey">al-laith-2025-exploring</identifier>
<location>
<url>https://aclanthology.org/2025.finnlp-1.23/</url>
</location>
<part>
<date>2025-01</date>
<extent unit="page">
<start>230</start>
<end>235</end>
</extent>
</part>
</mods>
</modsCollection>
%0 Conference Proceedings
%T Exploring the Effectiveness of Multilingual and Generative Large Language Models for Question Answering in Financial Texts
%A Al-Laith, Ali
%Y Chen, Chung-Chi
%Y Moreno-Sandoval, Antonio
%Y Huang, Jimin
%Y Xie, Qianqian
%Y Ananiadou, Sophia
%Y Chen, Hsin-Hsi
%S Proceedings of the Joint Workshop of the 9th Financial Technology and Natural Language Processing (FinNLP), the 6th Financial Narrative Processing (FNP), and the 1st Workshop on Large Language Models for Finance and Legal (LLMFinLegal)
%D 2025
%8 January
%I Association for Computational Linguistics
%C Abu Dhabi, UAE
%F al-laith-2025-exploring
%X This paper investigates the use of large language models (LLMs) for financial causality detection in the FinCausal 2025 shared task, focusing on generative and multilingual question answering (QA) tasks. Our study employed both generative and discriminative approaches, utilizing GPT-4o for generative QA and BERT-base-multilingual-cased, XLM-RoBERTa-large, and XLM-RoBERTa-base for multilingual QA across English and Spanish datasets. The datasets consist of financial disclosures where questions reflect causal relationships, paired with extractive answers derived directly from the text. Evaluation was conducted using Semantic Answer Similarity (SAS) and Exact Match (EM) metrics. While the discriminative XLM-RoBERTa-large model achieved the best overall performance, ranking 5th in English (SAS: 0.9598, EM: 0.7615) and 4th in Spanish (SAS: 0.9756, EM: 0.8084) among 11 team submissions, our results also highlight the effectiveness of the generative GPT-4o approach. Notably, GPT-4o achieved promising results in few-shot settings, with SAS scores approaching those of fine-tuned discriminative models, demonstrating that the generative approach can provide competitive performance despite lacking task-specific fine-tuning. This comparison underscores the potential of generative LLMs as robust, versatile alternatives for complex QA tasks like financial causality detection.
%U https://aclanthology.org/2025.finnlp-1.23/
%P 230-235
Markdown (Informal)
[Exploring the Effectiveness of Multilingual and Generative Large Language Models for Question Answering in Financial Texts](https://aclanthology.org/2025.finnlp-1.23/) (Al-Laith, FinNLP 2025)
ACL
Ali Al-Laith. 2025. [Exploring the Effectiveness of Multilingual and Generative Large Language Models for Question Answering in Financial Texts](https://aclanthology.org/2025.finnlp-1.23/). In *Proceedings of the Joint Workshop of the 9th Financial Technology and Natural Language Processing (FinNLP), the 6th Financial Narrative Processing (FNP), and the 1st Workshop on Large Language Models for Finance and Legal (LLMFinLegal)*, pages 230–235, Abu Dhabi, UAE. Association for Computational Linguistics.