Ashwin Rao
2024
Reading Between the Tweets: Deciphering Ideological Stances of Interconnected Mixed-Ideology Communities
Zihao He
|
Ashwin Rao
|
Siyi Guo
|
Negar Mokhberian
|
Kristina Lerman
Findings of the Association for Computational Linguistics: EACL 2024
Recent advances in NLP have improved our ability to understand the nuanced worldviews of online communities. Existing research focused on probing ideological stances treats liberals and conservatives as separate groups. However, this fails to account for the nuanced views of the organically formed online communities and the connections between them. In this paper, we study discussions of the 2020 U.S. election on Twitter to identify complex interacting communities. Capitalizing on this interconnectedness, we introduce a novel approach that harnesses message passing when finetuning language models (LMs) to probe the nuanced ideologies of these communities. By comparing the responses generated by LMs and real-world survey results, our method shows higher alignment than existing baselines, highlighting the potential of using LMs in revealing complex ideologies within and across interconnected mixed-ideology communities.
Whose Emotions and Moral Sentiments do Language Models Reflect?
Zihao He
|
Siyi Guo
|
Ashwin Rao
|
Kristina Lerman
Findings of the Association for Computational Linguistics: ACL 2024
Language models (LMs) are known to represent the perspectives of some social groups better than others, which may impact their performance, especially on subjective tasks such as content moderation and hate speech detection. To explore how LMs represent different perspectives, existing research focused on positional alignment, i.e., how closely the models mimic the opinions and stances of different groups, e.g., liberals or conservatives. However, human communication also encompasses emotional and moral dimensions. We define the problem of affective alignment, which measures how LMs’ emotional and moral tone represents those of different groups. By comparing the affect of responses generated by 36 LMs to the affect of Twitter messages written by two ideological groups, we observe significant misalignment of LMs with both ideological groups. This misalignment is larger than the partisan divide in the U.S. Even after steering the LMs towards specific ideological perspectives, the misalignment and liberal tendencies of the model persist, suggesting a systemic bias within LMs.
Don’t Blame the Data, Blame the Model: Understanding Noise and Bias When Learning from Subjective Annotations
Abhishek Anand
|
Negar Mokhberian
|
Prathyusha Kumar
|
Anweasha Saha
|
Zihao He
|
Ashwin Rao
|
Fred Morstatter
|
Kristina Lerman
Proceedings of the 1st Workshop on Uncertainty-Aware NLP (UncertaiNLP 2024)
Researchers have raised awareness about the harms of aggregating labels especially in subjective tasks that naturally contain disagreements among human annotators. In this work we show that models that are only provided aggregated labels show low confidence on high-disagreement data instances. While previous studies consider such instances as mislabeled, we argue that the reason the high-disagreement text instances have been hard-to-learn is that the conventional aggregated models underperform in extracting useful signals from subjective tasks. Inspired by recent studies demonstrating the effectiveness of learning from raw annotations, we investigate classifying using Multiple Ground Truth (Multi-GT) approaches. Our experiments show an improvement of confidence for the high-disagreement instances.
Search
Fix data
Co-authors
- Zihao He 3
- Kristina Lerman 3
- Siyi Guo 2
- Negar Mokhberian 2
- Abhishek Anand 1
- show all...