Suyash Sangwan


2022

pdf bib
TCS WITM 2022@FinSim4-ESG: Augmenting BERT with Linguistic and Semantic features for ESG data classification
Tushar Goel | Vipul Chauhan | Suyash Sangwan | Ishan Verma | Tirthankar Dasgupta | Lipika Dey
Proceedings of the Fourth Workshop on Financial Technology and Natural Language Processing (FinNLP)

Advanced neural network architectures have provided several opportunities to develop systems to automatically capture information from domain-specific unstructured text sources. The FinSim4-ESG shared task, collocated with the FinNLP workshop, proposed two sub-tasks. In sub-task1, the challenge was to design systems that could utilize contextual word embeddings along with sustainability resources to elaborate an ESG taxonomy. In the second sub-task, participants were asked to design a system that could classify sentences into sustainable or unsustainable sentences. In this paper, we utilize semantic similarity features along with BERT embeddings to segregate domain terms into a fixed number of class labels. The proposed model not only considers the contextual BERT embeddings but also incorporates Word2Vec, cosine, and Jaccard similarity which gives word-level importance to the model. For sentence classification, several linguistic elements along with BERT embeddings were used as classification features. We have shown a detailed ablation study for the proposed models.

2020

pdf bib
Identifying pandemic-related stress factors from social-media posts – Effects on students and young-adults
Sachin Thukral | Suyash Sangwan | Arnab Chatterjee | Lipika Dey
Proceedings of the 1st Workshop on NLP for COVID-19 (Part 2) at EMNLP 2020

The COVID-19 pandemic has thrown natural life out of gear across the globe. Strict measures are deployed to curb the spread of the virus that is causing it, and the most effective of them have been social isolation. This has led to wide-spread gloom and depression across society but more so among the young and the elderly. There are currently more than 200 million college students in 186 countries worldwide, affected due to the pandemic. The mode of education has changed suddenly, with the rapid adaptation of e-learning, whereby teaching is undertaken remotely and on digital platforms. This study presents insights gathered from social media posts that were posted by students and young adults during the COVID times. Using statistical and NLP techniques, we analyzed the behavioural issues reported by users themselves in their posts in depression related communities on Reddit. We present methodologies to systematically analyze content using linguistic techniques to find out the stress-inducing factors. Online education, losing jobs, isolation from friends and abusive families emerge as key stress factors