Arjun Choudhry


2024

pdf bib
Evaluating Gender Bias in Multilingual Multimodal AI Models: Insights from an Indian Context
Kshitish Ghate | Arjun Choudhry | Vanya Bannihatti Kumar
Proceedings of the 5th Workshop on Gender Bias in Natural Language Processing (GeBNLP)

We evaluate gender biases in multilingual multimodal image and text models in two settings: text-to-image retrieval and text-to-image generation, to show that even seemingly gender-neutral traits generate biased results. We evaluate our framework in the context of people from India, working with two languages: English and Hindi. We work with frameworks built around mCLIP-based models to ensure a thorough evaluation of recent state-of-the-art models in the multilingual setting due to their potential for widespread applications. We analyze the results across 50 traits for retrieval and 8 traits for generation, showing that current multilingual multimodal models are biased towards men for most traits, and this problem is further exacerbated for lower-resource languages like Hindi. We further discuss potential reasons behind this observation, particularly stemming from the bias introduced by the pretraining datasets.

2022

pdf bib
Emotion-guided Cross-domain Fake News Detection using Adversarial Domain Adaptation
Arjun Choudhry | Inder Khatri | Arkajyoti Chakraborty | Dinesh Vishwakarma | Mukesh Prasad
Proceedings of the 19th International Conference on Natural Language Processing (ICON)

Recent works on fake news detection have shown the efficacy of using emotions as a feature or emotions-based features for improved performance. However, the impact of these emotion-guided features for fake news detection in cross-domain settings, where we face the problem of domain shift, is still largely unexplored. In this work, we evaluate the impact of emotion-guided features for cross-domain fake news detection, and further propose an emotion-guided, domain-adaptive approach using adversarial learning. We prove the efficacy of emotion-guided models in cross-domain settings for various combinations of source and target datasets from FakeNewsAMT, Celeb, Politifact and Gossipcop datasets.