Rajarshi Roychoudhury


2022

pdf bib
The lack of theory is painful: Modeling Harshness in Peer Review Comments
Rajeev Verma | Rajarshi Roychoudhury | Tirthankar Ghosal
Proceedings of the 2nd Conference of the Asia-Pacific Chapter of the Association for Computational Linguistics and the 12th International Joint Conference on Natural Language Processing (Volume 1: Long Papers)

The peer-review system has primarily remained the central process of all science communications. However, research has shown that the process manifests a power-imbalance scenario where the reviewer enjoys a position where their comments can be overly critical and wilfully obtuse without being held accountable. This brings into question the sanctity of the peer-review process, turning it into a fraught and traumatic experience for authors. A little more effort to still remain critical but be constructive in the feedback would help foster a progressive outcome from the peer-review process. In this paper, we argue to intervene at the step where this power imbalance actually begins in the system. To this end, we develop the first dataset of peer-review comments with their real-valued harshness scores. We build our dataset by using the popular Best-Worst-Scaling mechanism. We show the utility of our dataset for text moderation in peer reviews to make review reports less hurtful and more welcoming. We release our dataset and associated codes in https://github.com/Tirthankar-Ghosal/moderating-peer-review-harshness. Our research is one step towards helping create constructive peer-review reports.

pdf bib
A Novel Approach towards Cross Lingual Sentiment Analysis using Transliteration and Character Embedding
Rajarshi Roychoudhury | Subhrajit Dey | Md Akhtar | Amitava Das | Sudip Naskar
Proceedings of the 19th International Conference on Natural Language Processing (ICON)

Sentiment analysis with deep learning in resource-constrained languages is a challenging task. In this paper, we introduce a novel approach for sentiment analysis in resource-constrained scenarios using character embedding and cross-lingual sentiment analysis with transliteration. We use this method to introduce the novel task of inducing sentiment polarity of words and sentences and aspect term sentiment analysis in the no-resource scenario. We formulate this task by taking a metalingual approach whereby we transliterate data from closely related languages and transform it into a meta language. We also demonstrated the efficacy of using character-level embedding for sentence representation. We experimented with 4 Indian languages – Bengali, Hindi, Tamil, and Telugu, and obtained encouraging results. We also presented new state-of-the-art results on the Hindi sentiment analysis dataset leveraging our metalingual character embeddings.

2021

pdf bib
Fine-tuning BERT to classify COVID19 tweets containing symptoms
Rajarshi Roychoudhury | Sudip Naskar
Proceedings of the Sixth Social Media Mining for Health (#SMM4H) Workshop and Shared Task

Twitter is a valuable source of patient-generated data that has been used in various population health studies. The first step in many of these studies is to identify and capture Twitter messages (tweets) containing medication mentions. Identifying personal mentions of COVID19 symptoms requires distinguishing personal mentions from other mentions such as symptoms reported by others and references to news articles or other sources. In this article, we describe our submission to Task 6 of the Social Media Mining for Health Applications (SMM4H) Shared Task 2021. This task challenged participants to classify tweets where the target classes are:(1) self-reports,(2) non-personal reports, and (3) literature/news mentions. Our system used a handcrafted preprocessing and word embeddings from BERT encoder model. We achieved an F1 score of 93%