Sushodhan Vaishampayan
2023
Audit Report Coverage Assessment using Sentence Classification
Sushodhan Vaishampayan | Nitin Ramrakhiyani | Sachin Pawar | Aditi Pawde | Manoj Apte | Girish Palshikar
Proceedings of the Sixth Workshop on Financial Technology and Natural Language Processing
Audit reports are a window into a company’s financial health, so gauging how well they cover various audit aspects is important. In this paper, we determine an audit report’s coverage by classifying its sentences into multiple domain-specific classes. In a weakly supervised setting, we employ a rule-based approach to automatically create training data for a BERT-based multi-label classifier. We then devise an ensemble that combines the rule-based and classifier approaches. Further, we employ two novel ways to improve the ensemble’s generalization: (i) an active-learning-based approach and (ii) an LLM-based review. We demonstrate that our proposed approaches outperform several baselines, and we show their utility for measuring audit coverage on a large dataset of 2.8K audit reports.
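The pipeline outlined in the abstract can be illustrated with a minimal sketch. Everything specific in it is an assumption made for illustration, not the paper's method: the class names and keyword patterns in `RULES` are hypothetical, the trained BERT-based multi-label classifier is replaced by a caller-supplied function, and the ensemble is taken to be a plain union of the two label sets, which may differ from the combination the authors actually use.

```python
import re
from typing import Callable, Dict, List, Set

# Hypothetical audit aspects and keyword patterns (not taken from the paper).
RULES: Dict[str, List[str]] = {
    "going_concern": [r"\bgoing concern\b"],
    "internal_controls": [r"\binternal (financial )?controls?\b"],
    "fraud": [r"\bfraud\b"],
}

# Stand-in type for a trained multi-label classifier: sentence -> set of labels.
Classifier = Callable[[str], Set[str]]


def rule_labels(sentence: str) -> Set[str]:
    """Weak labels: every class with at least one pattern matching the sentence."""
    text = sentence.lower()
    return {
        label
        for label, patterns in RULES.items()
        if any(re.search(pattern, text) for pattern in patterns)
    }


def ensemble_labels(sentence: str, classifier: Classifier) -> Set[str]:
    """Combine rule and classifier predictions (assumed here: set union)."""
    return rule_labels(sentence) | classifier(sentence)


def coverage(sentences: List[str], classifier: Classifier) -> Dict[str, bool]:
    """Report, for each class, whether any sentence in the report covers it."""
    found: Set[str] = set()
    for sentence in sentences:
        found |= ensemble_labels(sentence, classifier)
    return {label: label in found for label in RULES}


if __name__ == "__main__":
    report = [
        "The company has adequate internal financial controls.",
        "No material fraud was noticed during the year.",
    ]
    # A classifier that predicts nothing, so only the rules contribute.
    print(coverage(report, lambda sentence: set()))
```

In a weakly supervised setup of this kind, the output of `rule_labels` would serve as automatically generated training labels for the classifier, and `coverage` would be applied per report once the classifier is trained.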