Tanmay Basu


2022

IISERB@LT-EDI-ACL2022: A Bag of Words and Document Embeddings Based Framework to Identify Severity of Depression Over Social Media
Tanmay Basu
Proceedings of the Second Workshop on Language Technology for Equality, Diversity and Inclusion

The DepSign-LT-EDI-ACL2022 shared task focuses on early prediction of the severity of depression from social media posts. The BioNLP group at the Department of Data Science and Engineering, Indian Institute of Science Education and Research Bhopal (IISERB), participated in this challenge and submitted three runs based on three different text mining models. The severity of depression was categorized into three classes, viz., no depression, moderate, and severe, and the data for building the models were released as part of the shared task. The objective of this work is to identify relevant features from the given social media texts for effective text classification. As part of our investigation, we explored features derived from the text data using a document embedding technique and a simple bag of words model with different term weighting schemes. Subsequently, adaptive boosting, logistic regression, random forest, and support vector machine (SVM) classifiers were used to identify the severity level of depression from the given texts. The experimental analysis on the given validation data shows that the SVM classifier using the bag of words model with the term frequency and inverse document frequency (TF-IDF) weighting scheme outperforms the other models for identifying depression. However, this framework could not achieve a place among the top ten runs of the shared task. This paper describes the potential of the proposed framework as well as the possible reasons behind its mediocre performance on the given data.
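
As an illustration of the bag of words representation with term frequency and inverse document frequency weighting mentioned in the abstract, the following is a minimal sketch in plain Python. It is not the authors' implementation: the abstract does not state which TF-IDF variant, tokenizer, or preprocessing was used, so the whitespace tokenizer, the length-normalized term frequency, the unsmoothed log(N/df) inverse document frequency, the function name `tfidf_vectors`, and the toy posts are all assumptions made for this example.

```python
import math
from collections import Counter


def tfidf_vectors(docs):
    """Return (vocabulary, TF-IDF vectors) for a list of raw text documents.

    Assumed variant: tf = count / document length, idf = log(N / df).
    """
    # Naive lowercase whitespace tokenization (an assumption for this sketch).
    tokenized = [doc.lower().split() for doc in docs]
    vocab = sorted({tok for toks in tokenized for tok in toks})
    n_docs = len(tokenized)

    # Document frequency: number of documents containing each term.
    df = Counter(tok for toks in tokenized for tok in set(toks))
    idf = {tok: math.log(n_docs / df[tok]) for tok in vocab}

    vectors = []
    for toks in tokenized:
        counts = Counter(toks)
        total = len(toks) or 1  # guard against empty documents
        vectors.append([counts[tok] / total * idf[tok] for tok in vocab])
    return vocab, vectors


# Hypothetical toy posts, not taken from the shared task data.
posts = [
    "i feel fine today",
    "i feel hopeless and tired",
    "i cannot sleep and feel empty",
]
vocab, vectors = tfidf_vectors(posts)
```

Under this weighting, a term that occurs in every post (here "feel") receives a weight of zero, while terms confined to fewer posts receive positive weights. The resulting vectors could then be passed to any of the classifiers named in the abstract, such as an SVM.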