Corporate Bankruptcy Prediction with Domain-Adapted BERT

Alex Gunwoo Kim, Sangwon Yoon


Abstract
This study applies BERT, a representative contextualized language model, to corporate disclosure data to predict impending bankruptcies. Prior literature on bankruptcy prediction mainly focuses on developing more sophisticated prediction methodologies based on financial variables. In contrast, we focus on improving the quality of the input dataset. Specifically, we employ the BERT model to perform sentiment analysis on MD&A disclosures. We show that BERT outperforms dictionary-based and Word2Vec-based predictions in terms of adjusted R-square in logistic regression, k-nearest neighbor (kNN-5), and linear kernel support vector machine (SVM). Further, instead of pre-training the BERT model from scratch, we apply self-learning with confidence-based filtering to corporate disclosure data (10-K). We achieve an accuracy rate of 91.56% and demonstrate that the domain adaptation procedure brings a significant improvement in prediction accuracy.
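The confidence-based filtering step mentioned in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the function name `filter_confident`, the 0.9 threshold, and the example probabilities are all hypothetical, and the abstract does not state how confidence is measured. The sketch assumes a classifier has already produced class probabilities for unlabeled text, and keeps only the predictions whose top probability clears the threshold, so that they could be reused as pseudo-labels in a later training round.

```python
from typing import List, Sequence, Tuple


def filter_confident(
    probs: Sequence[Sequence[float]], threshold: float = 0.9
) -> List[Tuple[int, int]]:
    """Return (example_index, pseudo_label) pairs for confident predictions.

    `threshold` is an assumed cutoff, not a value taken from the paper.
    """
    kept = []
    for i, p in enumerate(probs):
        # Index of the most probable class (argmax).
        label = max(range(len(p)), key=lambda k: p[k])
        # Keep the example only if the model is confident enough.
        if p[label] >= threshold:
            kept.append((i, label))
    return kept


# Hypothetical class probabilities for three unlabeled passages.
unlabeled_probs = [
    [0.97, 0.03],
    [0.55, 0.45],
    [0.08, 0.92],
]
pseudo_labeled = filter_confident(unlabeled_probs, threshold=0.9)
print(pseudo_labeled)  # [(0, 0), (2, 1)]
```

In a self-learning loop, the retained pairs would be added to the labeled set and the model retrained; the low-confidence second example is left unlabeled.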
Anthology ID:
2021.econlp-1.4
Volume:
Proceedings of the Third Workshop on Economics and Natural Language Processing
Month:
November
Year:
2021
Address:
Punta Cana, Dominican Republic
Editors:
Udo Hahn, Veronique Hoste, Amanda Stent
Venue:
ECONLP
Publisher:
Association for Computational Linguistics
Pages:
26–36
URL:
https://aclanthology.org/2021.econlp-1.4
DOI:
10.18653/v1/2021.econlp-1.4
Cite (ACL):
Alex Gunwoo Kim and Sangwon Yoon. 2021. Corporate Bankruptcy Prediction with Domain-Adapted BERT. In Proceedings of the Third Workshop on Economics and Natural Language Processing, pages 26–36, Punta Cana, Dominican Republic. Association for Computational Linguistics.
Cite (Informal):
Corporate Bankruptcy Prediction with Domain-Adapted BERT (Kim & Yoon, ECONLP 2021)
PDF:
https://aclanthology.org/2021.econlp-1.4.pdf