Learning to Compare Financial Reports for Financial Forecasting

Ross Koval, Nicholas Andrews, Xifeng Yan


Abstract
Public companies in the US are required to publish annual reports that detail their recent financial performance, present the current state of ongoing business operations, and discuss future prospects. However, these reports typically contain over 25,000 words across all sections, large amounts of industry and legal jargon, and a high percentage of boilerplate content that changes little from year to year. These characteristics present challenges for many generic pretrained language models because only a small percentage of each long report is likely to carry meaningful signal about the future prospects of the company. In this work, we curate a large-scale dataset of paired financial reports and introduce two novel, challenging tasks, predicting long-horizon company risk and correlation, that evaluate the ability of a model to recognize cross-document relationships with complex, nuanced signals. We present a comprehensive set of methods and experiments and establish strong baselines designed to learn to identify subtle similarities and differences between long documents. Furthermore, we demonstrate that it is possible to predict company risk and correlation solely from the text of financial reports, and that modeling cross-document interactions at a fine-grained level provides significant benefit. Finally, we probe the best-performing model with quantitative and qualitative interpretability methods to gain insight into the underlying task signal.
Anthology ID:
2024.findings-eacl.34
Volume:
Findings of the Association for Computational Linguistics: EACL 2024
Month:
March
Year:
2024
Address:
St. Julian’s, Malta
Editors:
Yvette Graham, Matthew Purver
Venue:
Findings
Publisher:
Association for Computational Linguistics
Pages:
500–512
URL:
https://aclanthology.org/2024.findings-eacl.34
Cite (ACL):
Ross Koval, Nicholas Andrews, and Xifeng Yan. 2024. Learning to Compare Financial Reports for Financial Forecasting. In Findings of the Association for Computational Linguistics: EACL 2024, pages 500–512, St. Julian’s, Malta. Association for Computational Linguistics.
Cite (Informal):
Learning to Compare Financial Reports for Financial Forecasting (Koval et al., Findings 2024)
PDF:
https://aclanthology.org/2024.findings-eacl.34.pdf