TCS WITM 2022@FinSim4-ESG: Augmenting BERT with Linguistic and Semantic features for ESG data classification

Tushar Goel, Vipul Chauhan, Suyash Sangwan, Ishan Verma, Tirthankar Dasgupta, Lipika Dey


Abstract
Advanced neural network architectures have created new opportunities to build systems that automatically capture information from domain-specific unstructured text sources. The FinSim4-ESG shared task, co-located with the FinNLP workshop, proposed two sub-tasks. In sub-task 1, the challenge was to design systems that use contextual word embeddings along with sustainability resources to elaborate an ESG taxonomy. In sub-task 2, participants were asked to design a system that classifies sentences as sustainable or unsustainable. In this paper, we combine semantic similarity features with BERT embeddings to assign domain terms to a fixed number of class labels. The proposed model considers not only the contextual BERT embeddings but also Word2Vec, cosine, and Jaccard similarity, which add word-level information to the model. For sentence classification, several linguistic features were used along with BERT embeddings. We present a detailed ablation study of the proposed models.
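To make the word-level similarity features mentioned in the abstract more concrete, the following is a minimal sketch and not the authors' implementation. It assumes whitespace tokenization and bag-of-words count vectors for the cosine score (the paper may well compute cosine similarity over embeddings instead). The function names and the example labels are hypothetical and chosen only for illustration.

```python
import math
from collections import Counter


def tokenize(text: str) -> list[str]:
    # Assumption: lowercase whitespace tokenization; the paper's
    # preprocessing is not described in the abstract.
    return text.lower().split()


def jaccard_similarity(a: str, b: str) -> float:
    # |A intersect B| / |A union B| over the two token sets.
    set_a, set_b = set(tokenize(a)), set(tokenize(b))
    union = set_a | set_b
    if not union:
        return 0.0
    return len(set_a & set_b) / len(union)


def cosine_similarity(a: str, b: str) -> float:
    # Cosine of the angle between bag-of-words count vectors
    # (an illustrative stand-in for embedding-based cosine).
    vec_a, vec_b = Counter(tokenize(a)), Counter(tokenize(b))
    dot = sum(count * vec_b[token] for token, count in vec_a.items())
    norm_a = math.sqrt(sum(c * c for c in vec_a.values()))
    norm_b = math.sqrt(sum(c * c for c in vec_b.values()))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


def similarity_features(term: str, labels: list[str]) -> list[float]:
    # One (Jaccard, cosine) pair per candidate label, flattened into a
    # single feature vector.
    features: list[float] = []
    for label in labels:
        features.append(jaccard_similarity(term, label))
        features.append(cosine_similarity(term, label))
    return features


# Hypothetical labels, not taken from the FinSim4-ESG taxonomy.
example_labels = ["carbon emissions", "board diversity"]
example_features = similarity_features("carbon footprint", example_labels)
```

In a setup like the one the abstract describes, a vector such as `example_features` could be concatenated with a term's BERT embedding before the classification layer; how the authors actually combine the features is not specified here.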
Anthology ID: 2022.finnlp-1.32
Volume: Proceedings of the Fourth Workshop on Financial Technology and Natural Language Processing (FinNLP)
Month: December
Year: 2022
Address: Abu Dhabi, United Arab Emirates (Hybrid)
Editors: Chung-Chi Chen, Hen-Hsen Huang, Hiroya Takamura, Hsin-Hsi Chen
Venue: FinNLP
Publisher: Association for Computational Linguistics
Pages: 235–242
URL: https://aclanthology.org/2022.finnlp-1.32
DOI: 10.18653/v1/2022.finnlp-1.32
Cite (ACL): Tushar Goel, Vipul Chauhan, Suyash Sangwan, Ishan Verma, Tirthankar Dasgupta, and Lipika Dey. 2022. TCS WITM 2022@FinSim4-ESG: Augmenting BERT with Linguistic and Semantic features for ESG data classification. In Proceedings of the Fourth Workshop on Financial Technology and Natural Language Processing (FinNLP), pages 235–242, Abu Dhabi, United Arab Emirates (Hybrid). Association for Computational Linguistics.
Cite (Informal): TCS WITM 2022@FinSim4-ESG: Augmenting BERT with Linguistic and Semantic features for ESG data classification (Goel et al., FinNLP 2022)
PDF: https://aclanthology.org/2022.finnlp-1.32.pdf