Kyunghwan Sohn


pdf bib
CR-COPEC: Causal Rationale of Corporate Performance Changes to learn from Financial Reports
Ye Chun | Sunjae Kwon | Kyunghwan Sohn | Nakwon Sung | Junyoup Lee | Byoung Seo | Kevin Compher | Seung-won Hwang | Jaesik Choi
Findings of the Association for Computational Linguistics: EMNLP 2023

In this paper, we introduce CR-COPEC called Causal Rationale of Corporate Performance Changes from financial reports. This is a comprehensive large-scale domain-adaptation causal sentence dataset to detect financial performance changes of corporate. CR-COPEC contributes to two major achievements. First, it detects causal rationale from 10-K annual reports of the U.S. companies, which contain experts’ causal analysis following accounting standards in a formal manner. This dataset can be widely used by both individual investors and analysts as material information resources for investing and decision-making without tremendous effort to read through all the documents. Second, it carefully considers different characteristics which affect the financial performance of companies in twelve industries. As a result, CR-COPEC can distinguish causal sentences in various industries by taking unique narratives in each industry into consideration. We also provide an extensive analysis of how well CR-COPEC dataset is constructed and suited for classifying target sentences as causal ones with respect to industry characteristics.


pdf bib
The Global Banking Standards QA Dataset (GBS-QA)
Kyunghwan Sohn | Sunjae Kwon | Jaesik Choi
Proceedings of the Third Workshop on Economics and Natural Language Processing

A domain specific question answering (QA) dataset dramatically improves the machine comprehension performance. This paper presents a new Global Banking Standards QA dataset (GBS-QA) in the banking regulation domain. The GBS-QA has three values. First, it contains actual questions from market players and answers from global rule setter, the Basel Committee on Banking Supervision (BCBS) in the middle of creating and revising banking regulations. Second, financial regulation experts analyze and verify pairs of questions and answers in the annotation process. Lastly, the GBS-QA is a totally different dataset with existing datasets in finance and is applicable to stimulate transfer learning research in the banking regulation domain.