Amitava Das - ACL Anthology

Amitava Das

2025

SEPSIS: I Can Catch Your Lies – A New Paradigm for Deception Detection
Anku Rani | Dwip Dalal | Shreya Gautam | Pankaj Gupta | Vinija Jain | Aman Chadha | Amit Sheth | Amitava Das
Proceedings of the 63rd Annual Meeting of the Association for Computational Linguistics (Volume 4: Student Research Workshop)

Deception is the intentional practice of twisting information. It is a nuanced societal practice deeply intertwined with human societal evolution, characterized by a multitude of facets. This research explores the problem of deception through the lens of psychology, employing a framework that categorizes deception into three forms: lies of omission, lies of commission, and lies of influence. This study focuses specifically on lies of omission. We propose a novel framework for deception detection leveraging NLP techniques. We curated an annotated dataset of 876,784 samples by amalgamating a popular large-scale fake news dataset and scraped news headlines from the Twitter handle of “Times of India”, a well-known Indian news media house. Each sample has been labeled with four layers, namely: (i) the type of omission (speculation, bias, distortion, sounds factual, and opinion), (ii) colors of lies (black, white, grey, and red), (iii) the intention of such lies (to influence, gain social prestige, etc.), and (iv) the topic of lies (political, educational, religious, racial, and ethnicity). We present a novel multi-task learning (MTL) pipeline that leverages the dataless merging of fine-tuned language models to address the deception detection task mentioned earlier. Our proposed model achieved an impressive F1 score of 0.87, demonstrating strong performance across all layers including the type, color, intent, and topic aspects of deceptive content. Finally, our research aims to explore the relationship between the lies of omission and propaganda techniques. To accomplish this, we conducted an in-depth analysis, uncovering compelling findings. For instance, our analysis revealed a significant correlation between loaded language and opinion, shedding light on their interconnectedness. To encourage further research in this field, we are releasing the SEPSIS dataset and code at https://huggingface.co/datasets/ankurani/deception.

LLMsAgainstHate@NLU of Devanagari Script Languages 2025: Hate Speech Detection and Target Identification in Devanagari Languages via Parameter Efficient Fine-Tuning of LLMs
Rushendra Sidibomma | Pransh Patwa | Parth Patwa | Aman Chadha | Vinija Jain | Amitava Das
Proceedings of the First Workshop on Challenges in Processing South Asian Languages (CHiPSAL 2025)

The detection of hate speech has become increasingly important in combating online hostility and its real-world consequences. Despite recent advancements, there is limited research addressing hate speech detection in Devanagari-scripted languages, where resources and tools are scarce. While large language models (LLMs) have shown promise in language-related tasks, traditional fine-tuning approaches are often infeasible given the size of the models. In this paper, we propose a Parameter Efficient Fine tuning (PEFT) based solution for hate speech detection and target identification. We evaluate multiple LLMs on the Devanagari dataset provided by Thapa et al. (2025), which contains annotated instances in 2 languages - Hindi and Nepali. The results demonstrate the efficacy of our approach in handling Devanagari-scripted content. Code will be made publicly available on GitHub following acceptance.

KnowledgePrompts: Exploring the Abilities of Large Language Models to Solve Proportional Analogies via Knowledge-Enhanced Prompting
Thilini Wijesiriwardene | Ruwan Wickramarachchi | Sreeram Reddy Vennam | Vinija Jain | Aman Chadha | Amitava Das | Ponnurangam Kumaraguru | Amit Sheth
Proceedings of the 31st International Conference on Computational Linguistics

Making analogies is fundamental to cognition. Proportional analogies, which consist of four terms, are often used to assess linguistic and cognitive abilities. For instance, completing analogies like “Oxygen is to Gas as <blank> is to <blank>” requires identifying the semantic relationship (e.g., “type of”) between the first pair of terms (“Oxygen” and “Gas”) and finding a second pair that shares the same relationship (e.g., “Aluminum” and “Metal”). In this work, we introduce a 15K Multiple-Choice Question Answering (MCQA) dataset for proportional analogy completion and evaluate the performance of contemporary Large Language Models (LLMs) in various knowledge-enhanced prompt settings. Specifically, we augment prompts with three types of knowledge: exemplar, structured, and targeted. Our results show that despite extensive training data, solving proportional analogies remains challenging for current LLMs, with the best model achieving an accuracy of 55%. Notably, we find that providing targeted knowledge can better assist models in completing proportional analogies compared to providing exemplars or collections of structured knowledge. Our code and data are available at: https://github.com/Thiliniiw/KnowledgePrompts/

Alignment is no longer a luxury; it is a necessity. As large language models (LLMs) enter high-stakes domains like education, healthcare, governance, and law, their behavior must reliably reflect human-aligned values and safety constraints. Yet current evaluations rely heavily on behavioral proxies such as refusal rates, G-Eval scores, and toxicity classifiers, all of which have critical blind spots. Aligned models are often vulnerable to jailbreaking, stochasticity of generation, and alignment faking. To address this issue, we introduce the Alignment Quality Index (AQI). This novel geometric and prompt-invariant metric empirically assesses LLM alignment by analyzing the separation of safe and unsafe activations in latent space. By combining measures such as the Davies-Bouldin score (DBS), Dunn index (DI), Xie-Beni index (XBI), and Calinski-Harabasz index (CHI) across various formulations, AQI captures clustering quality to detect hidden misalignments and jailbreak risks, even when outputs appear compliant. AQI also serves as an early warning signal for alignment faking, offering a robust, decoding-invariant tool for behavior-agnostic safety auditing. Additionally, we propose the LITMUS dataset to facilitate robust evaluation under these challenging conditions. Empirical tests on LITMUS across different models trained under DPO, GRPO, and RLHF conditions demonstrate AQI’s correlation with external judges and ability to reveal vulnerabilities missed by refusal metrics. We make our implementation publicly available to foster future research in this area.
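As a rough illustration of one ingredient named in this abstract, the sketch below computes the standard Davies-Bouldin score for two clusters of vectors. It is not the AQI formula, which the abstract does not give; the "safe" and "unsafe" activation vectors are made-up toy data and the function names are hypothetical. Lower scores indicate better-separated clusters.

```python
import math


def centroid(points):
    """Component-wise mean of a list of equal-length vectors."""
    n = len(points)
    return [sum(p[d] for p in points) / n for d in range(len(points[0]))]


def euclidean(a, b):
    """Euclidean distance between two vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def davies_bouldin(clusters):
    """Standard Davies-Bouldin score; lower means better separation."""
    centroids = [centroid(c) for c in clusters]
    # Scatter: mean distance from each point to its cluster centroid.
    scatters = [
        sum(euclidean(p, ctr) for p in c) / len(c)
        for c, ctr in zip(clusters, centroids)
    ]
    k = len(clusters)
    total = 0.0
    for i in range(k):
        # Worst-case similarity of cluster i to any other cluster.
        total += max(
            (scatters[i] + scatters[j]) / euclidean(centroids[i], centroids[j])
            for j in range(k)
            if j != i
        )
    return total / k


# Toy 2-D "activations" (assumed data, for illustration only).
safe = [[0.0, 0.0], [0.0, 2.0], [2.0, 0.0], [2.0, 2.0]]
unsafe_far = [[x + 10.0, y] for x, y in safe]   # well separated from `safe`
unsafe_near = [[x + 2.0, y] for x, y in safe]   # overlapping with `safe`

score_far = davies_bouldin([safe, unsafe_far])
score_near = davies_bouldin([safe, unsafe_near])
print(score_far < score_near)  # True: the far cluster is better separated
```

In this toy setting, a model whose safe and unsafe activations overlap would receive the worse (higher) score, even if its visible outputs looked identical.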

DPO Kernels: A Semantically-Aware, Kernel-Enhanced, and Divergence-Rich Paradigm for Direct Preference Optimization
Amitava Das | Suranjana Trivedy | Danush Khanna | Yaswanth Narsupalli | Basab Ghosh | Rajarshi Roy | Gurpreet Singh | Vinija Jain | Vasu Sharma | Aishwarya Naresh Reganti | Aman Chadha
Findings of the Association for Computational Linguistics: ACL 2025

The rapid advancement of large language models (LLMs) has revolutionized numerous applications, but presents significant challenges in aligning these models with diverse human values, ethical standards, and specific user preferences. Direct Preference Optimization (DPO) has become a cornerstone for preference alignment but is constrained by reliance on fixed divergence measures and limited feature transformations. We introduce DPO-Kernels, an innovative enhancement of DPO that integrates kernel methods to overcome these challenges through four key contributions: (i) Kernelized Representations: These representations enhance divergence measures by using polynomial, RBF, Mahalanobis, and spectral kernels for richer feature transformations. Additionally, we introduce a hybrid loss that combines embedding-based loss with probability-based loss; (ii) Divergence Alternatives: Beyond Kullback–Leibler (KL), we incorporate Jensen-Shannon, Hellinger, Rényi, Bhattacharyya, Wasserstein, and other f-divergences to boost stability and robustness; (iii) Data-Driven Selection: Choosing the optimal kernel-divergence pair among 28 combinations (4 kernels × 7 divergences) is challenging. We introduce automatic metrics that analyze the data to select the best kernel-divergence pair, eliminating the need for manual tuning; (iv) Hierarchical Mixture of Kernels (HMK): Combining local and global kernels for precise and large-scale semantic modeling. This approach automatically selects the optimal kernel mixture during training, enhancing modeling flexibility. DPO-Kernels achieve state-of-the-art generalization in factuality, safety, reasoning, and instruction following across 12 datasets. While alignment risks overfitting, Heavy-Tailed Self-Regularization (HT-SR) theory confirms that DPO-Kernels ensure robust generalization in LLMs. Comprehensive resources are available to facilitate further research and application of DPO-Kernels.
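To make the "divergence alternatives" concrete, here is a minimal sketch of three of the divergences named above (Kullback–Leibler, Jensen-Shannon, Hellinger) on discrete distributions, using their textbook definitions. It does not reproduce the DPO-Kernels loss; the function names and the toy distributions are assumptions made for illustration.

```python
import math


def kl_divergence(p, q):
    """KL(p || q) for discrete distributions; assumes q[i] > 0 wherever p[i] > 0."""
    return sum(pi * math.log(pi / qi) for pi, qi in zip(p, q) if pi > 0)


def js_divergence(p, q):
    """Jensen-Shannon divergence: symmetric and bounded above by log(2)."""
    m = [(pi + qi) / 2 for pi, qi in zip(p, q)]
    return 0.5 * kl_divergence(p, m) + 0.5 * kl_divergence(q, m)


def hellinger_distance(p, q):
    """Hellinger distance, which lies in the interval [0, 1]."""
    return math.sqrt(
        0.5 * sum((math.sqrt(pi) - math.sqrt(qi)) ** 2 for pi, qi in zip(p, q))
    )


# Toy distributions over three outcomes (assumed values).
p = [0.7, 0.2, 0.1]
q = [0.1, 0.2, 0.7]

print(kl_divergence(p, q) > 0)                               # KL is non-negative
print(math.isclose(js_divergence(p, q), js_divergence(q, p)))  # JS is symmetric
print(0.0 <= hellinger_distance(p, q) <= 1.0)                # Hellinger is bounded
```

Unlike KL, the Jensen-Shannon and Hellinger measures stay finite when the two distributions have disjoint support, which is one common reason for considering them as alternatives.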

YinYang-Align: A new Benchmark for Competing Objectives and Introducing Multi-Objective Preference based Text-to-Image Alignment
Amitava Das | Yaswanth Narsupalli | Gurpreet Singh | Vinija Jain | Vasu Sharma | Suranjana Trivedy | Aman Chadha | Amit Sheth
Findings of the Association for Computational Linguistics: ACL 2025

Precise alignment in Text-to-Image (T2I) systems is crucial for generating visuals that reflect user intent while adhering to ethical and policy standards. Recent controversies, such as the Google Gemini-generated Pope image backlash, highlight the urgent need for robust alignment mechanisms. Building on alignment successes in Large Language Models (LLMs), this paper introduces YinYangAlign, a benchmarking framework designed to evaluate and optimize T2I systems across six inherently contradictory objectives. These objectives highlight core trade-offs, such as balancing faithfulness to prompts with artistic freedom and maintaining cultural sensitivity without compromising creativity. Alongside this benchmark, we propose the Contradictory Alignment Optimization (CAO) framework, an extension of Direct Preference Optimization (DPO), which employs multi-objective optimization techniques to address these competing goals. By leveraging per-axiom loss functions, synergy-driven global preferences, and innovative tools like the Synergy Jacobian, CAO achieves superior alignment across all objectives. Experimental results demonstrate significant improvements in fidelity, diversity, and ethical adherence, setting new benchmarks for the field. This work provides a scalable, effective approach to resolving alignment challenges in T2I systems while offering insights into broader AI alignment paradigms.

The rapid progress and widespread availability of text-to-image (T2I) generation models have heightened concerns about the misuse of AI-generated visuals, particularly in the context of misinformation campaigns. Existing AI-generated image detection (AGID) methods often overfit to known generators and falter on outputs from newer or unseen models. To systematically address this generalization gap, we introduce the Visual Counter Turing Test (VCT^2), a comprehensive benchmark of 166,000 images, comprising both real and synthetic prompt-image pairs produced by six state-of-the-art (SoTA) T2I systems: Stable Diffusion 2.1, SDXL, SD3 Medium, SD3.5 Large, DALL·E 3, and Midjourney 6. We curate two distinct subsets: COCO_AI, featuring structured captions from MS COCO, and Twitter_AI, containing narrative-style tweets from The New York Times. Under a unified zero-shot evaluation, we benchmark 17 leading AGID models and observe alarmingly low detection accuracy: 58% on COCO_AI and 58.34% on Twitter_AI. To transcend binary classification, we propose the Visual AI Index (V_AI), an interpretable, prompt-agnostic realism metric based on twelve low-level visual features, enabling us to quantify and rank the perceptual quality of generated outputs with greater nuance. Correlation analysis reveals a moderate inverse relationship between V_AI and detection accuracy (Pearson rho of -0.532 on COCO_AI and rho of -0.503 on Twitter_AI), suggesting that more visually realistic images tend to be harder to detect, a trend observed consistently across generators. We release COCO_AI and Twitter_AI to catalyze future advances in robust AGID and perceptual realism assessment.
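The correlation analysis mentioned above relies on the Pearson coefficient. A minimal sketch of that computation follows; the realism scores and detection accuracies are invented toy values, not figures from the benchmark, and the function name is hypothetical.

```python
import math


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length samples."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / math.sqrt(var_x * var_y)


# Toy data (assumed): one realism score and one detection accuracy per generator.
realism_scores = [0.2, 0.4, 0.6, 0.8]
detection_accuracy = [0.70, 0.66, 0.60, 0.55]

r = pearson(realism_scores, detection_accuracy)
print(r < 0)  # True: higher realism goes with lower detection accuracy here
```

A negative coefficient of this kind is what an inverse relationship between realism and detectability looks like numerically; its magnitude indicates how tightly the two quantities move together.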

ViBe: A Text-to-Video Benchmark for Evaluating Hallucination in Large Multimodal Models
Vipula Rawte | Sarthak Jain | Aarush Sinha | Garv Kaushik | Aman Bansal | Prathiksha Rumale Vishwanath | Samyak Rajesh Jain | Aishwarya Naresh Reganti | Vinija Jain | Aman Chadha | Amit Sheth | Amitava Das
Proceedings of the 5th Workshop on Trustworthy NLP (TrustNLP 2025)

Recent advances in Large Multimodal Models (LMMs) have expanded their capabilities to video understanding, with Text-to-Video (T2V) models excelling in generating videos from textual prompts. However, they still frequently produce hallucinated content, revealing AI-generated inconsistencies. We introduce ViBe (https://huggingface.co/datasets/ViBe-T2V-Bench/ViBe), a large-scale dataset of hallucinated videos from open-source T2V models. We identify five major hallucination types: Vanishing Subject, Omission Error, Numeric Variability, Subject Dysmorphia, and Visual Incongruity. Using ten T2V models, we generated and manually annotated 3,782 videos from 837 diverse MS COCO captions. Our proposed benchmark includes a dataset of hallucinated videos and a classification framework using video embeddings. ViBe serves as a critical resource for evaluating T2V reliability and advancing hallucination detection. We establish classification as a baseline, with the TimeSFormer + CNN ensemble achieving the best performance (0.345 accuracy, 0.342 F1 score). While the proposed initial baselines achieve modest accuracy, this highlights the difficulty of automated hallucination detection and the need for improved methods. Our research aims to drive the development of more robust T2V models and evaluate their outputs based on user preferences. Our code is available at: https://anonymous.4open.science/r/vibe-1840/

Defining and Quantifying Visual Hallucinations in Vision-Language Models
Vipula Rawte | Aryan Mishra | Amit Sheth | Amitava Das
Proceedings of the 5th Workshop on Trustworthy NLP (TrustNLP 2025)

The troubling rise of hallucination presents perhaps the most significant impediment to the advancement of responsible AI. In recent times, considerable research has focused on detecting and mitigating hallucination in Large Language Models (LLMs). However, it’s worth noting that hallucination is also quite prevalent in Vision-Language models (VLMs). In this paper, we offer a fine-grained discourse on profiling VLM hallucination based on the image captioning task. We delineate eight fine-grained orientations of visual hallucination: i) Contextual Guessing, ii) Identity Incongruity, iii) Geographical Erratum, iv) Visual Illusion, v) Gender Anomaly, vi) VLM as Classifier, vii) Wrong Reading, and viii) Numeric Discrepancy. We curate Visual HallucInation eLiciTation, a publicly available dataset comprising 2,000 samples generated using eight VLMs across the image captioning task, along with human annotations for the categories as mentioned earlier. To establish a method for quantification and to offer a comparative framework enabling the evaluation and ranking of VLMs according to their vulnerability to producing hallucinations, we propose the Visual Hallucination Vulnerability Index (VHVI). In summary, we introduce the VHILT dataset for image-to-text hallucinations and propose the VHVI metric to quantify hallucinations in VLMs, targeting specific visual hallucination types. A subset sample is available at: https://huggingface.co/datasets/vr25/vhil. The full dataset will be publicly released upon acceptance.

FACTOID: FACtual enTailment fOr hallucInation Detection
Vipula Rawte | S.m Towhidul Islam Tonmoy | Shravani Nag | Aman Chadha | Amit Sheth | Amitava Das
Proceedings of the 5th Workshop on Trustworthy NLP (TrustNLP 2025)

2024

Evidence-backed Fact Checking using RAG and Few-Shot In-Context Learning with LLMs
Ronit Singal | Pransh Patwa | Parth Patwa | Aman Chadha | Amitava Das
Proceedings of the Seventh Fact Extraction and VERification Workshop (FEVER)

Given the widespread dissemination of misinformation on social media, implementing fact-checking mechanisms for online claims is essential. Manually verifying every claim is very challenging, underscoring the need for an automated fact-checking system. This paper presents our system designed to address this issue. We utilize the Averitec dataset (Schlichtkrull et al., 2023) to assess the performance of our fact-checking system. In addition to veracity prediction, our system provides supporting evidence, which is extracted from the dataset. We develop a Retrieve and Generate (RAG) pipeline to extract relevant evidence sentences from a knowledge base, which are then passed along with the claim to a large language model (LLM) for classification. We also evaluate the few-shot In-Context Learning (ICL) capabilities of multiple LLMs. Our system achieves an ‘Averitec’ score of 0.33, which is a 22% absolute improvement over the baseline. Our code is publicly available at https://github.com/ronit-singhal/evidence-backed-fact-checking-using-rag-and-few-shot-in-context-learning-with-llms.

On the Relationship between Sentence Analogy Identification and Sentence Structure Encoding in Large Language Models
Thilini Wijesiriwardene | Ruwan Wickramarachchi | Aishwarya Naresh Reganti | Vinija Jain | Aman Chadha | Amit Sheth | Amitava Das
Findings of the Association for Computational Linguistics: EACL 2024

The ability of Large Language Models (LLMs) to encode syntactic and semantic structures of language is well examined in NLP. Additionally, analogy identification, in the form of word analogies, has been extensively studied in the last decade of language modeling literature. In this work, we specifically look at how LLMs’ abilities to capture sentence analogies (sentences that convey analogous meaning to each other) vary with LLMs’ abilities to encode syntactic and semantic structures of sentences. Through our analysis, we find that LLMs’ ability to identify sentence analogies is positively correlated with their ability to encode syntactic and semantic structures of sentences. Specifically, we find that the LLMs that capture syntactic structures better also have higher abilities in identifying sentence analogies.

Counter Turing Test (CT²): Investigating AI-Generated Text Detection for Hindi - Ranking LLMs based on Hindi AI Detectability Index (ADI_hi)
Ishan Kavathekar | Anku Rani | Ashmit Chamoli | Ponnurangam Kumaraguru | Amit P. Sheth | Amitava Das
Findings of the Association for Computational Linguistics: EMNLP 2024

MULTILATE: A Synthetic Dataset on AI-Generated MULTImodaL hATE Speech
Advaitha Vetagiri | Eisha Halder | Ayanangshu Das Majumder | Partha Pakray | Amitava Das
Proceedings of the 21st International Conference on Natural Language Processing (ICON)

One of the pressing challenges society faces today is the rapid proliferation of online hate speech, exacerbated by the rise of AI-generated multimodal hate content. This new form of synthetically produced hate speech presents unprecedented challenges in detection and moderation. In response to the growing presence of such harmful content across social media platforms, this research introduces a groundbreaking solution:

Tutorial Proposal: Hallucination in Large Language Models
Vipula Rawte | Aman Chadha | Amit Sheth | Amitava Das
Proceedings of the 2024 Joint International Conference on Computational Linguistics, Language Resources and Evaluation (LREC-COLING 2024): Tutorial Summaries

In the fast-paced domain of Large Language Models (LLMs), the issue of hallucination is a prominent challenge. Despite continuous endeavors to address this concern, it remains a highly active area of research within the LLM landscape. Grasping the intricacies of this problem can be daunting, especially for those new to the field. This tutorial aims to bridge this knowledge gap by introducing the emerging realm of hallucination in LLMs. It will comprehensively explore the key aspects of hallucination, including benchmarking, detection, and mitigation techniques. Furthermore, we will delve into the specific constraints and shortcomings of current approaches, providing valuable insights to guide future research efforts for participants.

2023

FACTIFY-5WQA: 5W Aspect-based Fact Verification through Question Answering
Anku Rani | S.M Towhidul Islam Tonmoy | Dwip Dalal | Shreya Gautam | Megha Chakraborty | Aman Chadha | Amit Sheth | Amitava Das
Proceedings of the 61st Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers)

Automatic fact verification has received significant attention recently. Contemporary automatic fact-checking systems focus on estimating truthfulness using numerical scores which are not human-interpretable. A human fact-checker generally follows several logical steps to verify a verisimilitude claim and conclude whether it’s truthful or a mere masquerade. Popular fact-checking websites follow a common structure for fact categorization such as half true, half false, false, pants on fire, etc. Therefore, it is necessary to have an aspect-based (delineating which part(s) are true and which are false) explainable system that can assist human fact-checkers in asking relevant questions related to a fact, which can then be validated separately to reach a final verdict. In this paper, we propose a 5W framework (who, what, when, where, and why) for question-answer-based fact explainability. To that end, we present a semi-automatically generated dataset called FACTIFY-5WQA, which consists of 391,041 facts along with relevant 5W QAs – underscoring our major contribution to this paper. A semantic role labeling system has been utilized to locate 5Ws, which generates QA pairs for claims using a masked language model. Finally, we report a baseline QA system to automatically locate those answers from evidence documents, which can serve as a baseline for future research in the field. Lastly, we propose a robust fact verification system that takes paraphrased claims and automatically validates them. The dataset and the baseline model are available at https://github.com/ankuranii/acl-5W-QA

CONFLATOR: Incorporating Switching Point based Rotatory Positional Encodings for Code-Mixed Language Modeling
Mohsin Mohammed | Sai Kandukuri | Neeharika Gupta | Parth Patwa | Anubhab Chatterjee | Vinija Jain | Aman Chadha | Amitava Das
Proceedings of the 6th Workshop on Computational Approaches to Linguistic Code-Switching

The mixing of two or more languages is called Code-Mixing (CM). CM is a social norm in multilingual societies. Neural Language Models (NLMs) like transformers have been effective on many NLP tasks. However, NLM for CM is an under-explored area. Though transformers are capable and powerful, they cannot always encode positional information since they are non-recurrent. Therefore, to enrich word information and incorporate positional information, positional encoding is defined. We hypothesize that Switching Points (SPs), i.e., junctions in the text where the language switches (L1 -> L2 or L2 -> L1), pose a challenge for CM Language Models (LMs), and hence give special emphasis to SPs in the modeling process. We experiment with several positional encoding mechanisms and show that rotatory positional encodings along with switching point information yield the best results. We introduce CONFLATOR: a neural language modeling approach for code-mixed languages. CONFLATOR tries to learn to emphasize switching points using smarter positional encoding, both at unigram and bigram levels. CONFLATOR outperforms the state-of-the-art on two tasks based on code-mixed Hindi and English (Hinglish): (i) sentiment analysis and (ii) machine translation.
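The notion of a switching point can be sketched in a few lines. The example below assumes that each token already carries a language tag and, as a convention chosen here, reports the index of the first token after each switch. It illustrates only the definition of SPs, not CONFLATOR's positional encoding; the function name, tokens, and tags are hypothetical.

```python
def switching_points(lang_tags):
    """Indices where the language tag differs from the previous token's tag."""
    return [
        i for i in range(1, len(lang_tags))
        if lang_tags[i] != lang_tags[i - 1]
    ]


# Toy Hinglish sentence with per-token language tags (assumed annotation).
tokens = ["mujhe", "yeh", "movie", "pasand", "hai"]
tags = ["hi", "hi", "en", "hi", "hi"]

points = switching_points(tags)
print(points)                       # [2, 3]
print([tokens[i] for i in points])  # ['movie', 'pasand']
```

Here index 2 marks the Hindi-to-English switch and index 3 the switch back, which are the positions a model would be asked to emphasize.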

Counter Turing Test (CT2): AI-Generated Text Detection is Not as Easy as You May Think - Introducing AI Detectability Index (ADI)
Megha Chakraborty | S.M Towhidul Islam Tonmoy | S M Mehedi Zaman | Shreya Gautam | Tanay Kumar | Krish Sharma | Niyar Barman | Chandan Gupta | Vinija Jain | Aman Chadha | Amit Sheth | Amitava Das
Proceedings of the 2023 Conference on Empirical Methods in Natural Language Processing

With the rise of prolific ChatGPT, the risk and consequences of AI-generated text have increased alarmingly. This triggered a series of events, including an open letter, signed by thousands of researchers and tech leaders in March 2023, demanding a six-month moratorium on the training of AI systems more sophisticated than GPT-4. To address the inevitable question of ownership attribution for AI-generated artifacts, the US Copyright Office released a statement stating that “if the content is traditional elements of authorship produced by a machine, the work lacks human authorship and the office will not register it for copyright”. Furthermore, both the US and the EU governments have recently drafted their initial proposals regarding the regulatory framework for AI. Given this cynosural spotlight on generative AI, AI-generated text detection (AGTD) has emerged as a topic that has already received immediate attention in research, with some initial methods having been proposed, soon followed by the emergence of techniques to bypass detection. This paper introduces the Counter Turing Test (CT2), a benchmark consisting of techniques aiming to offer a comprehensive evaluation of the robustness of existing AGTD techniques. Our empirical findings unequivocally highlight the fragility of the proposed AGTD methods under scrutiny. Amidst the extensive deliberations on policy-making for regulating AI development, it is of utmost importance to assess the detectability of content generated by LLMs. Thus, to establish a quantifiable spectrum facilitating the evaluation and ranking of LLMs according to their detectability levels, we propose the AI Detectability Index (ADI). We conduct a thorough examination of 15 contemporary LLMs, empirically demonstrating that larger LLMs tend to have a lower ADI, indicating they are less detectable compared to smaller LLMs. We firmly believe that ADI holds significant value as a tool for the wider NLP community, with the potential to serve as a rubric in AI-related policy-making.

The Troubling Emergence of Hallucination in Large Language Models - An Extensive Definition, Quantification, and Prescriptive Remediations
Vipula Rawte | Swagata Chakraborty | Agnibh Pathak | Anubhav Sarkar | S.M Towhidul Islam Tonmoy | Aman Chadha | Amit Sheth | Amitava Das
Proceedings of the 2023 Conference on Empirical Methods in Natural Language Processing

The recent advancements in Large Language Models (LLMs) have garnered widespread acclaim for their remarkable emerging capabilities. However, the issue of hallucination has parallelly emerged as a by-product, posing significant concerns. While some recent endeavors have been made to identify and mitigate different types of hallucination, there has been a limited emphasis on the nuanced categorization of hallucination and associated mitigation methods. To address this gap, we offer a fine-grained discourse on profiling hallucination based on its degree, orientation, and category, along with offering strategies for alleviation. As such, we define two overarching orientations of hallucination: (i) factual mirage (FM) and (ii) silver lining (SL). To provide a more comprehensive understanding, both orientations are further sub-categorized into intrinsic and extrinsic, with three degrees of severity - (i) mild, (ii) moderate, and (iii) alarming. We also meticulously categorize hallucination into six types: (i) acronym ambiguity, (ii) numeric nuisance, (iii) generated golem, (iv) virtual voice, (v) geographic erratum, and (vi) time wrap. Furthermore, we curate HallucInation eLiciTation (HILT), a publicly available dataset comprising 75,000 samples generated using 15 contemporary LLMs along with human annotations for the aforementioned categories. Finally, to establish a method for quantifying and to offer a comparative spectrum that allows us to evaluate and rank LLMs based on their vulnerability to producing hallucinations, we propose Hallucination Vulnerability Index (HVI). Amidst the extensive deliberations on policy-making for regulating AI development, it is of utmost importance to assess and measure which LLM is more vulnerable towards hallucination. We firmly believe that HVI holds significant value as a tool for the wider NLP community, with the potential to serve as a rubric in AI-related policy-making. In conclusion, we propose two solution strategies for mitigating hallucinations.

Combating disinformation is one of the burning societal crises - about 67% of the American population believes that disinformation produces a lot of uncertainty, and 10% of them knowingly propagate disinformation. Evidence shows that disinformation can manipulate democratic processes and public opinion, causing disruption in the share market, panic and anxiety in society, and even death during crises. Therefore, disinformation should be identified promptly and, if possible, mitigated. With approximately 3.2 billion images and 720,000 hours of video shared online daily on social media platforms, scalable detection of multimodal disinformation requires efficient fact verification. Despite progress in automatic text-based fact verification (e.g., FEVER, LIAR), the research community lacks substantial effort in multimodal fact verification. To address this gap, we introduce FACTIFY 3M, a dataset of 3 million samples that pushes the boundaries of the domain of fact verification via a multimodal fake news dataset, in addition to offering explainability through the concept of 5W question-answering. Salient features of the dataset include: (i) textual claims, (ii) ChatGPT-generated paraphrased claims, (iii) associated images, (iv) stable diffusion-generated additional images (i.e., visual paraphrases), (v) pixel-level image heatmap to foster image-text explainability of the claim, (vi) 5W QA pairs, and (vii) adversarial fake news stories.

ANALOGICAL - A Novel Benchmark for Long Text Analogy Evaluation in Large Language Models
Thilini Wijesiriwardene | Ruwan Wickramarachchi | Bimal Gajera | Shreeyash Gowaikar | Chandan Gupta | Aman Chadha | Aishwarya Naresh Reganti | Amit Sheth | Amitava Das
Findings of the Association for Computational Linguistics: ACL 2023

Over the past decade, analogies, in the form of word-level analogies, have played a significant role as an intrinsic measure of evaluating the quality of word embedding methods such as word2vec. Modern large language models (LLMs), however, are primarily evaluated on extrinsic measures based on benchmarks such as GLUE and SuperGLUE, and there are only a few investigations on whether LLMs can draw analogies between long texts. In this paper, we present ANALOGICAL, a new benchmark to intrinsically evaluate LLMs across a taxonomy of analogies of long text with six levels of complexity – (i) word, (ii) word vs. sentence, (iii) syntactic, (iv) negation, (v) entailment, and (vi) metaphor. Using thirteen datasets and three different distance measures, we evaluate the abilities of eight LLMs in identifying analogical pairs in the semantic vector space. Our evaluation finds that it is increasingly challenging for LLMs to identify analogies when going up the analogy taxonomy.

IMAGINATOR: Pre-Trained Image+Text Joint Embeddings using Word-Level Grounding of Images
Varuna Krishna Kolla | Suryavardan Suresh | Shreyash Mishra | Sathyanarayanan Ramamoorthy | Parth Patwa | Megha Chakraborty | Aman Chadha | Amitava Das | Amit Sheth
Proceedings of the 20th International Conference on Natural Language Processing (ICON)

Word embeddings, i.e., semantically meaningful vector representations of words, are largely influenced by the distributional hypothesis “You shall know a word by the company it keeps” (Harris, 1954), whereas modern prediction-based neural network embeddings rely on design choices and hyperparameter optimization. Word embeddings like Word2Vec, GloVe, etc. capture contextuality and real-world analogies well, but contemporary convolution-based image embeddings such as VGGNet, AlexNet, etc. do not capture contextual knowledge. The popular king-queen analogy does not hold true for most commonly used vision embeddings. In this paper, we introduce a pre-trained joint embedding (JE), named IMAGINATOR, trained on 21K distinct image objects. JE is a way to encode multimodal data into a vector space where the text modality serves as the grounding key, to which the complementary modality (in this case, the image) is anchored. IMAGINATOR encapsulates three individual representations: (i) object-object co-location, (ii) word-object co-location, and (iii) word-object correlation. These three representations capture complementary aspects of the two modalities, which are further combined to obtain the final object-word JEs. Generated JEs are intrinsically evaluated to assess how well they capture contextuality and real-world analogies. We also evaluate pre-trained IMAGINATOR JEs on three downstream tasks: (i) image captioning, (ii) Image2Tweet, and (iii) text-based image retrieval. IMAGINATOR establishes a new standard on the aforementioned downstream tasks by outperforming the current SoTA on all the selected tasks. The code is available at https://github.com/varunakk/IMAGINATOR.

CNLP-NITS at SemEval-2023 Task 10: Online sexism prediction, PREDHATE!
Advaitha Vetagiri | Prottay Adhikary | Partha Pakray | Amitava Das
Proceedings of the 17th International Workshop on Semantic Evaluation (SemEval-2023)

Online sexism is a rising issue that threatens women’s safety, fosters hostile situations, and upholds social inequities. We describe SemEval-2023 Task 10, a task for creating English-language models that can precisely identify and categorize sexist content on internet forums and social platforms like Gab and Reddit, as well as provide explainability, in order to address this problem. The problem is divided into three hierarchically organized subtasks: binary sexism detection, sexism by category, and sexism by fine-grained vector. The dataset consists of 20,000 labelled entries. For Task A, pretrained models were used: a combined Convolutional Neural Network (CNN) and Bidirectional Long Short-Term Memory (BiLSTM) model, called CNN-BiLSTM, and a Generative Pretrained Transformer 2 (GPT-2) model. The GPT-2 model was also used for Tasks B and C, and we provide the experimental configurations. According to our findings, the GPT-2 model performs better than the CNN-BiLSTM model for Task A, while GPT-2 is highly accurate for Tasks B and C on the training, validation and testing splits of the training data provided in the task. Our proposed models allow researchers to create more precise and understandable models for identifying and categorizing sexist content in online forums, thereby empowering users and moderators.

2022

A Novel Approach towards Cross Lingual Sentiment Analysis using Transliteration and Character Embedding
Rajarshi Roychoudhury | Subhrajit Dey | Md Shad Akhtar | Amitava Das | Sudip Kumar Naskar
Proceedings of the 19th International Conference on Natural Language Processing (ICON)

Sentiment analysis with deep learning in resource-constrained languages is a challenging task. In this paper, we introduce a novel approach for sentiment analysis in resource-constrained scenarios using character embedding and cross-lingual sentiment analysis with transliteration. We use this method to introduce the novel task of inducing sentiment polarity of words and sentences and aspect term sentiment analysis in the no-resource scenario. We formulate this task by taking a metalingual approach whereby we transliterate data from closely related languages and transform it into a meta language. We also demonstrated the efficacy of using character-level embedding for sentence representation. We experimented with 4 Indian languages – Bengali, Hindi, Tamil, and Telugu, and obtained encouraging results. We also presented new state-of-the-art results on the Hindi sentiment analysis dataset leveraging our metalingual character embeddings.

2021

Co-attention based Multimodal Factorized Bilinear Pooling for Internet Memes Analysis
Gitanjali Kumari | Amitava Das | Asif Ekbal
Proceedings of the 18th International Conference on Natural Language Processing (ICON)

Social media platforms like Facebook, Twitter, and Instagram have a significant impact on several aspects of society. Memes are a new type of social media communication found on social platforms. Even though memes are primarily used to distribute humorous content, certain memes propagate hate speech through dark humor. It is critical to properly analyze and filter out these toxic memes from social media. However, the implicit presence of sarcasm and humor makes analyzing memes more challenging. This paper proposes an end-to-end neural network architecture that learns the complex association between the text and image of a meme. For this purpose, we use the recent SemEval-2020 Task 8 multimodal dataset. We propose an end-to-end CNN-based deep neural network architecture with two sub-modules, viz. (i) a co-attention based sub-module and (ii) a Multimodal Factorized Bilinear Pooling (MFB) sub-module, to represent the textual and visual features of a meme in a more fine-grained way. We demonstrate the effectiveness of our proposed work through extensive experiments. The experimental results show that our proposed model achieves a 36.81% macro F1-score, outperforming all the baseline models.

Image2tweet: Datasets in Hindi and English for Generating Tweets from Images
Rishabh Jha | Varshith Kaki | Varuna Kolla | Shubham Bhagat | Parth Patwa | Amitava Das | Santanu Pal
Proceedings of the 18th International Conference on Natural Language Processing (ICON)

Image captioning is a task that has seen major updates over time. In recent methods, visual-linguistic grounding of the image-text pair is leveraged. This includes either generating a textual description of the objects and entities present within the image in a constrained manner, or generating a detailed description of these entities as a paragraph. But there is still a long way to go towards being able to generate text that is not only semantically richer, but also contains real-world knowledge. This is the motivation behind exploring image2tweet generation through the lens of existing image-captioning approaches. At the same time, there is little research on image captioning in Indian languages like Hindi. In this paper, we release Hindi and English datasets for the task of tweet generation given an image. The aim is to generate a specialized text like a tweet that is not a direct result of the visual-linguistic grounding usually leveraged in similar tasks, but conveys a message that factors in not only the visual content of the image, but also additional real-world contextual information associated with the event described within the image as closely as possible. Further, we provide baseline DL models on our data and invite researchers to build more sophisticated systems for the problem.

2020

Proceedings of the 4th Workshop on Computational Approaches to Code Switching
Thamar Solorio | Monojit Choudhury | Kalika Bali | Sunayana Sitaram | Amitava Das | Mona Diab
Proceedings of the 4th Workshop on Computational Approaches to Code Switching

Hater-O-Genius Aggression Classification using Capsule Networks
Parth Patwa | Srinivas Pykl | Amitava Das | Prerana Mukherjee | Viswanath Pulabaigari
Proceedings of the 17th International Conference on Natural Language Processing (ICON)

Contending with hate speech in social media is one of the most challenging social problems of our time. There are various types of anti-social behavior in social media. Foremost among them is aggressive behavior, which causes many social issues, such as affecting the social lives and mental health of social media users. In this paper, we propose an end-to-end ensemble-based architecture to automatically identify and classify aggressive tweets. Tweets are classified into three categories: Covertly Aggressive, Overtly Aggressive, and Non-Aggressive. The proposed architecture is an ensemble of smaller subnetworks that are able to characterize the feature embeddings effectively. We demonstrate qualitatively that each of the smaller subnetworks is able to learn unique features. Our best model is an ensemble of Capsule Networks and achieves a 65.2% F1 score on the Facebook test set, a performance gain of 0.95% over the TRAC-2018 winners. The code and the model weights are publicly available at https://github.com/parthpatwa/Hater-O-Genius-Aggression-Classification-using-Capsule-Networks.

Minority Positive Sampling for Switching Points - an Anecdote for the Code-Mixing Language Modeling
Arindam Chatterjere | Vineeth Guptha | Parul Chopra | Amitava Das
Proceedings of the Twelfth Language Resources and Evaluation Conference

Code-Mixing (CM) or language mixing is a social norm in multilingual societies. CM is quite prevalent in social media conversations in multilingual regions like India, Europe, Canada and Mexico. In this paper, we explore the problem of Language Modeling (LM) for code-mixed Hinglish text. In recent times, there have been several success stories with neural language modeling, like Generative Pre-trained Transformer (GPT) (Radford et al., 2019), Bidirectional Encoder Representations from Transformers (BERT) (Devlin et al., 2018), etc. Hence, neural language models have become the new holy grail of modern NLP, although LM for CM (LMCM) is an unexplored area altogether. To better understand the problem of LM for CM, we initially experimented with several statistical language modeling techniques and consequently experimented with contemporary neural language models. Analysis shows that switching points are the main challenge behind the LMCM performance drop; therefore, in this paper we introduce the idea of minority positive sampling to selectively induce more samples to achieve better performance. On the contrary, all neural language models demand a huge corpus to train on for better performance. Finally, we report a perplexity of 139 for Hinglish (Hindi-English language pair) LMCM using statistical bi-directional techniques.

SemEval-2020 Task 8: Memotion Analysis- the Visuo-Lingual Metaphor!
Chhavi Sharma | Deepesh Bhageria | William Scott | Srinivas PYKL | Amitava Das | Tanmoy Chakraborty | Viswanath Pulabaigari | Björn Gambäck
Proceedings of the Fourteenth Workshop on Semantic Evaluation

Information on social media comprises various modalities such as textual, visual and audio. NLP and Computer Vision communities often leverage only one prominent modality in isolation to study social media. However, computational processing of Internet memes needs a hybrid approach. The growing ubiquity of Internet memes on social media platforms such as Facebook, Instagram, and Twitter further suggests that we cannot ignore such multimodal content anymore. To the best of our knowledge, there has not been much attention towards meme emotion analysis. The objective of this proposal is to bring the attention of the research community towards the automatic processing of Internet memes. The Memotion analysis task released approximately 10K annotated memes with human-annotated labels, namely sentiment (positive, negative, neutral), type of emotion (sarcastic, funny, offensive, motivational), and their corresponding intensity. The challenge consisted of three subtasks: sentiment (positive, negative, and neutral) analysis of memes, overall emotion (humor, sarcasm, offensive, and motivational) classification of memes, and classifying the intensity of meme emotion. The best performances achieved were F1 (macro average) scores of 0.35, 0.51 and 0.32, respectively, for the three subtasks.

SemEval-2020 Task 9: Overview of Sentiment Analysis of Code-Mixed Tweets
Parth Patwa | Gustavo Aguilar | Sudipta Kar | Suraj Pandey | Srinivas PYKL | Björn Gambäck | Tanmoy Chakraborty | Thamar Solorio | Amitava Das
Proceedings of the Fourteenth Workshop on Semantic Evaluation

In this paper, we present the results of the SemEval-2020 Task 9 on Sentiment Analysis of Code-Mixed Tweets (SentiMix 2020). We also release and describe our Hinglish (Hindi-English) and Spanglish (Spanish-English) corpora annotated with word-level language identification and sentence-level sentiment labels. These corpora comprise 20K and 19K examples, respectively. The sentiment labels are Positive, Negative, and Neutral. SentiMix attracted 89 submissions in total, including 61 teams that participated in the Hinglish contest and 28 that submitted systems to the Spanglish competition. The best performance achieved was 75.0% F1 score for Hinglish and 80.6% F1 for Spanglish. We observe that BERT-like models and ensemble methods are the most common and successful approaches among the participants.

Aggression and Misogyny Detection using BERT: A Multi-Task Approach
Niloofar Safi Samghabadi | Parth Patwa | Srinivas PYKL | Prerana Mukherjee | Amitava Das | Thamar Solorio
Proceedings of the Second Workshop on Trolling, Aggression and Cyberbullying

In recent times, the focus of the NLP community has increased towards offensive language, aggression, and hate-speech detection. This paper presents our system for TRAC-2 shared task on “Aggression Identification” (sub-task A) and “Misogynistic Aggression Identification” (sub-task B). The data for this shared task is provided in three different languages - English, Hindi, and Bengali. Each data instance is annotated into one of the three aggression classes - Not Aggressive, Covertly Aggressive, Overtly Aggressive, as well as one of the two misogyny classes - Gendered and Non-Gendered. We propose an end-to-end neural model using attention on top of BERT that incorporates a multi-task learning paradigm to address both the sub-tasks simultaneously. Our team, “na14”, scored 0.8579 weighted F1-measure on the English sub-task B and secured 3rd rank out of 15 teams for the task. The code and the model weights are publicly available at https://github.com/NiloofarSafi/TRAC-2. Keywords: Aggression, Misogyny, Abusive Language, Hate-Speech Detection, BERT, NLP, Neural Networks, Social Media

2019

NIT_Agartala_NLP_Team at SemEval-2019 Task 6: An Ensemble Approach to Identifying and Categorizing Offensive Language in Twitter Social Media Corpora
Steve Durairaj Swamy | Anupam Jamatia | Björn Gambäck | Amitava Das
Proceedings of the 13th International Workshop on Semantic Evaluation

The paper describes the systems submitted to OffensEval (SemEval 2019, Task 6) on ‘Identifying and Categorizing Offensive Language in Social Media’ by the ‘NIT_Agartala_NLP_Team’. A Twitter annotated dataset of 13,240 English tweets was provided by the task organizers to train the individual models, with the best results obtained using an ensemble model composed of six different classifiers. The ensemble model produced macro-averaged F1-scores of 0.7434, 0.7078 and 0.4853 on Subtasks A, B, and C, respectively. The paper highlights the overall low predictive nature of various linguistic features and surface level count features, as well as the limitations of a traditional machine learning approach when compared to a Deep Learning counterpart.
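
Several abstracts on this page report macro-averaged F1-scores. As a reminder of what that metric computes, here is a minimal sketch: the per-class F1 is calculated separately for every label and the results are averaged with equal weight. The function name, the label strings, and the toy predictions are hypothetical and are not taken from any of the papers' data.

```python
def macro_f1(y_true, y_pred):
    """Unweighted mean of per-class F1 over all labels seen in either list."""
    if len(y_true) != len(y_pred):
        raise ValueError("y_true and y_pred must have the same length")
    labels = sorted(set(y_true) | set(y_pred))
    per_class = []
    for label in labels:
        tp = sum(1 for t, p in zip(y_true, y_pred) if t == label and p == label)
        fp = sum(1 for t, p in zip(y_true, y_pred) if t != label and p == label)
        fn = sum(1 for t, p in zip(y_true, y_pred) if t == label and p != label)
        precision = tp / (tp + fp) if (tp + fp) else 0.0
        recall = tp / (tp + fn) if (tp + fn) else 0.0
        if precision + recall:
            per_class.append(2 * precision * recall / (precision + recall))
        else:
            per_class.append(0.0)  # class never predicted correctly
    return sum(per_class) / len(per_class)


# Toy example with made-up labels and predictions.
gold = ["OFF", "OFF", "NOT", "NOT"]
predicted = ["OFF", "NOT", "NOT", "NOT"]
score = macro_f1(gold, predicted)
```

Because every class counts equally, a rare class that is predicted poorly pulls the macro average down more than it would an accuracy or micro-averaged score.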

2017

A Societal Sentiment Analysis: Predicting the Values and Ethics of Individuals by Analysing Social Media Content
Tushar Maheshwari | Aishwarya N. Reganti | Samiksha Gupta | Anupam Jamatia | Upendra Kumar | Björn Gambäck | Amitava Das
Proceedings of the 15th Conference of the European Chapter of the Association for Computational Linguistics: Volume 1, Long Papers

To find out how users’ social media behaviour and language are related to their ethical practices, the paper investigates applying Schwartz’ psycholinguistic model of societal sentiment to social media text. The analysis is based on corpora collected from user essays as well as social media (Facebook and Twitter). Several experiments were carried out on the corpora to classify the ethical values of users, incorporating Linguistic Inquiry Word Count analysis, n-grams, topic models, psycholinguistic lexica, speech-acts, and non-linguistic information, while applying a range of machine learners (Support Vector Machines, Logistic Regression, and Random Forests) to identify the best linguistic and non-linguistic features for automatic classification of values and ethics.

Measuring the Limit of Semantic Divergence for English Tweets.
Dwijen Rudrapal | Amitava Das
Proceedings of the International Conference Recent Advances in Natural Language Processing, RANLP 2017

In human language, an expression could be conveyed in many ways by different people. Even the same person may express the same sentence quite differently when addressing different audiences, using different modalities, different syntactic variations, or a different vocabulary. The possibility of such endless surface forms of text, while the meaning of the text remains almost the same, poses many challenges for Natural Language Processing (NLP) systems such as question answering, machine translation, and text summarization. This research paper is an endeavor to understand the characteristics of such endless semantic divergence. In this research work, we develop a corpus of 1525 semantically divergent sentences for 200 English tweets.

“A pessimist sees the difficulty in every opportunity; an optimist sees the opportunity in every difficulty” – Understanding the psycho-sociological influences to it
Updendra Kumar | Vishal Kumar Rana | Srinivas PYKL | Amitava Das
Proceedings of the 14th International Conference on Natural Language Processing (ICON-2017)

“Who Mentions Whom?”- Understanding the Psycho-Sociological Aspects of Twitter Mention Network
R Sudhesh Solomon | Abhay Narayan | Srinivas P Y K L | Amitava Das
Proceedings of the 14th International Conference on Natural Language Processing (ICON-2017)

2016

Comparing the Level of Code-Switching in Corpora
Björn Gambäck | Amitava Das
Proceedings of the Tenth International Conference on Language Resources and Evaluation (LREC'16)

Social media texts are often fairly informal and conversational, and when produced by bilinguals tend to be written in several different languages simultaneously, in the same way as conversational speech. The recent availability of large social media corpora has thus also made large-scale code-switched resources available for research. The paper addresses the issues of evaluation and comparison these new corpora entail, by defining an objective measure of corpus level complexity of code-switched texts. It is also shown how this formal measure can be used in practice, by applying it to several code-switched corpora.

Cosmopolitan Mumbai, Orthodox Delhi, Techcity Bangalore: Understanding City Specific Societal Sentiment
Aishwarya N Reganti | Tushar Maheshwari | Upendra Kumar | Amitava Das
Proceedings of the 13th International Conference on Natural Language Processing

2015

Part-of-Speech Tagging for Code-Mixed English-Hindi Twitter and Facebook Chat Messages
Anupam Jamatia | Björn Gambäck | Amitava Das
Proceedings of the International Conference Recent Advances in Natural Language Processing

Measuring Semantic Similarity for Bengali Tweets Using WordNet
Dwijen Rudrapal | Amitava Das | Baby Bhattacharya
Proceedings of the International Conference Recent Advances in Natural Language Processing

Sentence Boundary Detection for Social Media Text
Dwijen Rudrapal | Anupam Jamatia | Kunal Chakma | Amitava Das | Björn Gambäck
Proceedings of the 12th International Conference on Natural Language Processing

2014

Code Mixing: A Challenge for Language Identification in the Language of Social Media
Utsab Barman | Amitava Das | Joachim Wagner | Jennifer Foster
Proceedings of the First Workshop on Computational Approaches to Code Switching

A Framework for Health Behavior Change using Companionable Robots
Bandita Sarma | Amitava Das | Rodney Nielsen
Proceedings of the 8th International Natural Language Generation Conference (INLG)

Identifying Languages at the Word Level in Code-Mixed Indian Social Media Text
Amitava Das | Björn Gambäck
Proceedings of the 11th International Conference on Natural Language Processing

2013

Code-Mixing in Social Media Text
Amitava Das | Björn Gambäck
Traitement Automatique des Langues, Volume 54, Numéro 3 : Traitement automatique du langage naturel pour l'analyse des réseaux sociaux (TAL et réseaux sociaux) [Social Networks and NLP]

2012

A Light Weight Stemmer in Kokborok
Braja Gopal Patra | Khumbar Debbarma | Swapan Debbarma | Dipankar Das | Amitava Das | Sivaji Bandyopadhyay
Proceedings of the 24th Conference on Computational Linguistics and Speech Processing (ROCLING 2012)

Sentimantics: Conceptual Spaces for Lexical Sentiment Polarity Representation with Contextuality
Amitava Das | Björn Gambäck
Proceedings of the 3rd Workshop in Computational Approaches to Subjectivity and Sentiment Analysis

2011

PsychoSentiWordNet
Amitava Das
Proceedings of the ACL 2011 Student Session

Dr Sentiment Knows Everything!
Amitava Das | Sivaji Bandyopadhyay
Proceedings of the ACL-HLT 2011 System Demonstrations

2010

Topic-Based Bengali Opinion Summarization
Amitava Das | Sivaji Bandyopadhyay
Coling 2010: Posters

English to Indian Languages Machine Transliteration System at NEWS 2010
Amitava Das | Tanik Saikh | Tapabrata Mondal | Asif Ekbal | Sivaji Bandyopadhyay
Proceedings of the 2010 Named Entities Workshop

SentiWordNet for Indian Languages
Amitava Das | Sivaji Bandyopadhyay
Proceedings of the Eighth Workshop on Asian Language Resouces

SemanticNet-Perception of Human Pragmatics
Amitava Das | Sivaji Bandyopadhyay
Proceedings of the 2nd Workshop on Cognitive Aspects of the Lexicon

Clause Identification and Classification in Bengali
Aniruddha Ghosh | Amitava Das | Sivaji Bandyopadhyay
Proceedings of the 1st Workshop on South and Southeast Asian Natural Language Processing

JU_CSE_GREC10: Named Entity Generation at GREC 2010
Amitava Das | Tanik Saikh | Tapabrata Mondal | Sivaji Bandyopadhyay
Proceedings of the 6th International Natural Language Generation Conference

Towards the Global SentiWordNet
Amitava Das | Sivaji Bandyopadhyay
Proceedings of the 24th Pacific Asia Conference on Language, Information and Computation

2009

English to Hindi Machine Transliteration System at NEWS 2009
Amitava Das | Asif Ekbal | Tapabrata Mondal | Sivaji Bandyopadhyay
Proceedings of the 2009 Named Entities Workshop: Shared Task on Transliteration (NEWS 2009)

2008

Language Independent Named Entity Recognition in Indian Languages
Asif Ekbal | Rejwanul Haque | Amitava Das | Venkateswarlu Poka | Sivaji Bandyopadhyay
Proceedings of the IJCNLP-08 Workshop on Named Entity Recognition for South and South East Asian Languages

Co-authors

Srinivas Pykl 5

Megha Chakraborty 4

Anupam Jamatia 4

Aishwarya Naresh Reganti 4

S.m Towhidul Islam Tonmoy 4

Shreya Gautam 3

Tapabrata Mondal 3

Dwijen Rudrapal 3

Gurpreet Singh 3

Thamar Solorio 3

Ruwan Wickramarachchi 3

Thilini Wijesiriwardene 3

Hasnat Md Abdullah 2

Abhilekh Borah 2

Tanmoy Chakraborty 2

Chandan Gupta 2

Danush Khanna 2

Upendra Kumar 2

Ponnurangam Kumaraguru 2

Tushar Maheshwari 2

Prerana Mukherjee 2

Yaswanth Narsupalli 2

Partha Pakray 2

Viswanath Pulabaigari 2

Aishwarya N. Reganti 2

Chhavi Sharma 2

Suranjana Trivedy 2

Advaitha Vetagiri 2

Prottay Adhikary 1

Gustavo Aguilar 1

Md. Shad Akhtar 1

Shashwat Bajpai 1

Shubham Bhagat 1

Deepesh Bhageria 1

Utkarsh Bhatt 1

Baby Bhattacharya 1

Shwetangshu Biswas 1

Swagata Chakraborty 1

Ashmit Chamoli 1

Anubhab Chatterjee 1

Shreyas Chatterjee 1

Arindam Chatterjere 1

Monojit Choudhury 1

Khumbar Debbarma 1

Swapan Debbarma 1

Subhrajit Dey 1

Shreyas Dixit 1

Jennifer Foster 1

Subhankar Ghosh 1

Aniruddha Ghosh 1

Shreeyash Gowaikar 1

Neeharika Gupta 1

Samiksha Gupta 1

Vineeth Guptha 1

Preethi Gurumurthy 1

Rejwanul Haque 1

Nasrin Imanpour 1

Samyak Rajesh Jain 1

Varshith Kaki 1

Sai Kandukuri 1

Ishan Kavathekar 1

Varuna Krishna Kolla 1

Nishoak Kosaraju 1

Updendra Kumar 1

Gitanjali Kumari 1

Srinivas P Y K L 1

Ayanangshu Das Majumder 1

Shreyash Mishra 1

Mohsin Mohammed 1

Samahriti Mukherjee 1

Abhay Narayan 1

Sudip Kumar Naskar 1

Rodney Nielsen 1

Khushbu Pahwa 1

Aditya Pakala 1

Agnibh Pathak 1

Braja Gopal Patra 1

Venkateswarlu Poka 1

Sathyanarayanan Ramamoorthy 1

Vishal Kumar Rana 1

Raghav Kaushik Ravi 1

Janvita Reddy 1

Rajarshi Roychoudhury 1

Niloofar Safi Samghabadi 1

Sainath Reddy Sankepally 1

Anubhav Sarkar 1

Arghya Sarkar 1

Bandita Sarma 1

William Scott 1

Kinjal Sensharma 1

Rushendra Sidibomma 1

Shubham Singh 1

Sunayana Sitaram 1

R Sudhesh Solomon 1

Suryavardan Suresh 1

Steve Durairaj Swamy 1

Sreeram Reddy Vennam 1

Prathiksha Rumale Vishwanath 1

Joachim Wagner 1

S M Mehedi Zaman 1

Venues