IISERB@LT-EDI-ACL2022: A Bag of Words and Document Embeddings Based Framework to Identify Severity of Depression Over Social Media

Tanmay Basu


Abstract
The DepSign-LT-EDI-ACL2022 shared task focuses on early prediction of severity of depression over social media posts. The BioNLP group at Department of Data Science and Engineering in Indian Institute of Science Education and Research Bhopal (IISERB) has participated in this challenge and submitted three runs based on three different text mining models. The severity of depression were categorized into three classes, viz., no depression, moderate, and severe and the data to build models were released as part of this shared task. The objective of this work is to identify relevant features from the given social media texts for effective text classification. As part of our investigation, we explored features derived from text data using document embeddings technique and simple bag of words model following different weighting schemes. Subsequently, adaptive boosting, logistic regression, random forest and support vector machine (SVM) classifiers were used to identify the scale of depression from the given texts. The experimental analysis on the given validation data show that the SVM classifier using the bag of words model following term frequency and inverse document frequency weighting scheme outperforms the other models for identifying depression. However, this framework could not achieve a place among the top ten runs of the shared task. This paper describes the potential of the proposed framework as well as the possible reasons behind mediocre performance on the given data.
Anthology ID:
2022.ltedi-1.33
Volume:
Proceedings of the Second Workshop on Language Technology for Equality, Diversity and Inclusion
Month:
May
Year:
2022
Address:
Dublin, Ireland
Editors:
Bharathi Raja Chakravarthi, B Bharathi, John P McCrae, Manel Zarrouk, Kalika Bali, Paul Buitelaar
Venue:
LTEDI
SIG:
Publisher:
Association for Computational Linguistics
Note:
Pages:
234–238
Language:
URL:
https://aclanthology.org/2022.ltedi-1.33
DOI:
10.18653/v1/2022.ltedi-1.33
Bibkey:
Cite (ACL):
Tanmay Basu. 2022. IISERB@LT-EDI-ACL2022: A Bag of Words and Document Embeddings Based Framework to Identify Severity of Depression Over Social Media. In Proceedings of the Second Workshop on Language Technology for Equality, Diversity and Inclusion, pages 234–238, Dublin, Ireland. Association for Computational Linguistics.
Cite (Informal):
IISERB@LT-EDI-ACL2022: A Bag of Words and Document Embeddings Based Framework to Identify Severity of Depression Over Social Media (Basu, LTEDI 2022)
Copy Citation:
PDF:
https://aclanthology.org/2022.ltedi-1.33.pdf
Video:
 https://aclanthology.org/2022.ltedi-1.33.mp4