Shweta Jain

2026

Council of LLMs: Evaluating Capability of Large Language Models to Annotate Propaganda
Vivek Sharma | Shweta Jain | Mohammad Shokri | Sarah Ita Levitan | Elena Filatova
The Proceedings for the 15th Workshop on Computational Approaches to Subjectivity, Sentiment Social Media Analysis (WASSA 2026)

Data annotation is essential for supervised natural language processing tasks but remains labor-intensive and expensive. Large language models (LLMs) have emerged as promising alternatives, capable of generating high-quality annotations either autonomously or in collaboration with human annotators. However their use in autonomous annotations is often questioned for their ethical take on subjective matters. This study investigates the effectiveness of LLMs in a autonomous, and hybrid annotation setups in propaganda detection. We evaluate GPT and open-source models on two datasets from different domains, namely, Propaganda Techniques Corpus (PTC) for news articles and the Journalist Media Bias on X (JMBX) for social media. Our results show that LLMs, in general, exhibit high recall but lower precision in detecting propaganda, often over-predicting persuasive content. Multi-annotator setups did not outperform the best models in single-annotator setting although it helped reasoning models boost their performance. Hybrid annotation, combining LLMs and human input, achieved the highest overall accuracy than LLM-only settings. We further analyze misclassifications and found that LLM have higher sensitivity towards certain propaganda techniques like loaded language, name calling, and doubt. Finally, using error typology analysis, we explore the reasoning provided on misclassifications by the LLM. Our result shows that although some studies report LLM outperforming manual annotations and it could prove useful in hybrid annotation, its incorporation in the human annotation pipeline must be implemented with caution.

pdf bib abs

The Impact of Highlighting Subjective Language on Perceived News Trustworthiness
Mohammad Shokri | Vivek Sharma | Emily Klapper | Shweta Jain | Elena Filatova | Sarah Ita Levitan
The Proceedings for the 15th Workshop on Computational Approaches to Subjectivity, Sentiment Social Media Analysis (WASSA 2026)

The rise of misinformation and opinionated articles has made understanding how misleading or biased content influences readers an increasingly important problem. While most prior work focuses on detecting misinformation or deceptive language in real time, far less attention has been paid to how such content is perceived by readers, which is an essential component of misinformation’s effectiveness. In this study, we examine whether highlighting subjective sentences in news articles affects perceived trustworthiness. Using a controlled user experiment and 1,334 article–reader evaluations, we find that highlighting subjective content produces a modest yet statistically significant decrease in trust, with substantial variation across articles and participants. To explain this variation, we model trust change after highlighting subjective language as a function of article-level linguistic features and reader-level attitudes. Our findings suggest that readers’ reactions to highlighted subjective language are driven primarily by characteristics of the text itself, and that highlighting subjective language offers benefits for may help readers better assess the reliability of potentially misleading news articles.

2024

pdf bib abs

Subjectivity Detection in English News using Large Language Models
Mohammad Shokri | Vivek Sharma | Elena Filatova | Shweta Jain | Sarah Levitan
Proceedings of the 14th Workshop on Computational Approaches to Subjectivity, Sentiment, & Social Media Analysis

Trust in media has reached a historical low as consumers increasingly doubt the credibility of the news they encounter. This growing skepticism is exacerbated by the prevalence of opinion-driven articles, which can influence readers’ beliefs to align with the authors’ viewpoints. In response to this trend, this study examines the expression of opinions in news by detecting subjective and objective language. We conduct an analysis of the subjectivity present in various news datasets and evaluate how different language models detect subjectivity and generalize to out-of-distribution data. We also investigate the use of in-context learning (ICL) within large language models (LLMs) and propose a straightforward prompting method that outperforms standard ICL and chain-of-thought (CoT) prompts.

Co-authors

Sarah Levitan 1

Venues

WASSA3
WS3

Fix author