Shubhashis Sengupta


Document Retrieval and Claim Verification to Mitigate COVID-19 Misinformation
Megha Sundriyal | Ganeshan Malhotra | Md Shad Akhtar | Shubhashis Sengupta | Andrew Fano | Tanmoy Chakraborty
Proceedings of the Workshop on Combating Online Hostile Posts in Regional Languages during Emergency Situations

During the COVID-19 pandemic, the spread of misinformation on online social media has grown exponentially. Unverified bogus claims on these platforms regularly mislead people, leading them to believe in half-baked truths. The current practice for combating this avalanche of misinformation is to employ manual fact-checkers to verify claims. However, establishing the veracity of such claims is becoming increasingly challenging, partly due to the plethora of information available, which is difficult to process manually. Thus, it becomes imperative to verify claims automatically, without human intervention. To address this issue, we propose an automated claim verification solution comprising two steps: document retrieval and veracity prediction. For the retrieval module, we employ a hybrid search-based system with BM25 as the base retriever and experiment with recent state-of-the-art transformer-based models for re-ranking. Furthermore, in the second step, we use a BART-based textual entailment architecture to authenticate the retrieved documents. We report experimental findings demonstrating that our retrieval module outperforms the best baseline system by 10.32 NDCG@100 points. We provide a demonstration to assess the efficacy and impact of our suggested solution. As a byproduct of this study, we present an open-source, easily deployable, and user-friendly Python API that the community can adopt.
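For readers unfamiliar with the two retrieval terms in this abstract, the following is a minimal sketch of Okapi BM25 scoring and of NDCG@k. It is not the authors' implementation: the function names, the defaults `k1=1.5` and `b=0.75`, the toy documents, and the linear-gain form of NDCG are all assumptions made for illustration, and the transformer-based re-ranking step is not shown.

```python
import math
from collections import Counter

def bm25_scores(query, docs, k1=1.5, b=0.75):
    """Score each tokenized document against a tokenized query with Okapi BM25."""
    n_docs = len(docs)
    avg_len = sum(len(d) for d in docs) / n_docs
    # Document frequency: number of documents that contain each term.
    df = Counter(term for d in docs for term in set(d))
    scores = []
    for d in docs:
        tf = Counter(d)
        score = 0.0
        for term in query:
            if term not in tf:
                continue
            idf = math.log(1 + (n_docs - df[term] + 0.5) / (df[term] + 0.5))
            # Length normalization: longer-than-average documents are penalized.
            norm = tf[term] + k1 * (1 - b + b * len(d) / avg_len)
            score += idf * tf[term] * (k1 + 1) / norm
        scores.append(score)
    return scores

def ndcg_at_k(relevances, k):
    """NDCG@k for graded relevance labels listed in ranked order (linear gain)."""
    def dcg(rels):
        return sum(r / math.log2(i + 2) for i, r in enumerate(rels[:k]))
    ideal = dcg(sorted(relevances, reverse=True))
    return dcg(relevances) / ideal if ideal > 0 else 0.0

# Hypothetical toy corpus; real systems would tokenize full documents.
docs = [["covid", "vaccine", "safe"], ["weather", "today"], ["vaccine", "myth"]]
scores = bm25_scores(["covid", "vaccine"], docs)
ranking = sorted(range(len(docs)), key=lambda i: scores[i], reverse=True)
```

In a retrieve-then-re-rank pipeline of the kind described, the top BM25 candidates would then be passed to a second, more expensive model that reorders them, and NDCG@100 would be computed over the final top 100.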


Unknown Intent Detection using Multi-Objective Optimization on Deep Learning Classifiers
Prerna Prem | Zishan Ahmad | Asif Ekbal | Shubhashis Sengupta | Sakshi Jain | Roshini Rammani
Proceedings of the 35th Pacific Asia Conference on Language, Information and Computation

Unknown Intent Detection Using Multi-Objective Optimization on Deep Learning Classifiers
Prerna Prem | Zishan Ahmad | Asif Ekbal | Shubhashis Sengupta | Sakshi C. Jain | Roshni Ramnani
Proceedings of the International Conference on Recent Advances in Natural Language Processing (RANLP 2021)

Modelling and understanding dialogues in a conversation depends on identifying the user intent from the given text. Unknown or new intent detection is a critical task because, in a realistic scenario, a user's intent may change frequently over time and even shift to an intent not previously encountered. Separating unknown-intent samples from known-intent ones is challenging, as an unknown user intent can range from something similar to the predefined intents to something completely different. Prior research on intent discovery often treats it as a classification task in which an unknown intent can belong to a predefined set of known intent classes. In this paper, we tackle the problem of detecting a completely unknown intent without any prior hints about the kinds of classes the unknown intents belong to. We propose an effective post-processing method that uses multi-objective optimization to tune an existing neural-network-based intent classifier and make it capable of detecting unknown intents. We perform experiments using existing state-of-the-art intent classifiers and apply our method on top of them for unknown intent detection. Our experiments across different domains and real-world datasets show that our method yields significant improvements over state-of-the-art methods for unknown intent detection.
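The abstract does not say how the multi-objective optimization is set up, so the sketch below only illustrates the general idea of post-processing a classifier under two competing objectives. It assumes a single confidence threshold below which a prediction is rejected as "unknown", and it keeps the thresholds that are Pareto-optimal for (known intents kept, unknown intents caught). The threshold mechanism, function names, and sample confidences are hypothetical and are not the authors' method.

```python
def evaluate(threshold, known_conf, unknown_conf):
    """Return (known_kept, unknown_caught) rates for a rejection threshold."""
    known_kept = sum(c >= threshold for c in known_conf) / len(known_conf)
    unknown_caught = sum(c < threshold for c in unknown_conf) / len(unknown_conf)
    return known_kept, unknown_caught

def pareto_front(candidates, known_conf, unknown_conf):
    """Keep thresholds that no other candidate beats on both objectives."""
    scored = [(t, evaluate(t, known_conf, unknown_conf)) for t in candidates]
    front = []
    for t, (a, b) in scored:
        dominated = any(
            a2 >= a and b2 >= b and (a2 > a or b2 > b)
            for _, (a2, b2) in scored
        )
        if not dominated:
            front.append((t, a, b))
    return front

def predict(probs, labels, threshold):
    """Return the top label, or "unknown" if its confidence is below threshold."""
    best = max(range(len(probs)), key=probs.__getitem__)
    return labels[best] if probs[best] >= threshold else "unknown"

# Hypothetical top-class confidences from a held-out validation set.
known_conf = [0.9, 0.8, 0.95, 0.6]
unknown_conf = [0.3, 0.5, 0.7]
front = pareto_front([0.2, 0.4, 0.65, 0.75, 0.99], known_conf, unknown_conf)
```

Each surviving threshold represents a different trade-off; choosing among them is a separate decision that depends on how costly a missed unknown intent is compared with a rejected known one.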


Intent Mining from past conversations for Conversational Agent
Ajay Chatterjee | Shubhashis Sengupta
Proceedings of the 28th International Conference on Computational Linguistics

Conversational systems are of primary interest in the AI community. Organizations are increasingly using chatbots to provide round-the-clock support and to increase customer engagement. Many commercial bot-building frameworks follow a standard approach that requires one to build and train an intent model to recognize user input. These frameworks require a collection of user utterances and their corresponding intents to train an intent model. Collecting training data with sufficient coverage is a bottleneck in the bot-building process. Even where past conversation data is available, labeling hundreds of utterances with intents is time-consuming and laborious. In this paper, we present an intent discovery framework that can mine a vast amount of conversational logs and generate labeled datasets for training intent models. We introduce ITER-DBSCAN, a density-based clustering algorithm that extends DBSCAN to unbalanced data. Empirical evaluation on one conversation dataset, six different intent datasets, and one short-text clustering dataset shows the effectiveness of our approach.
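To give a rough idea of how a density-based method can be applied iteratively to unbalanced data, the sketch below runs plain DBSCAN, sets aside the points it clusters, and re-runs it on the remaining noise with a looser setting. The abstract does not specify the ITER-DBSCAN procedure, so the parameter schedule (growing `eps`, shrinking `min_samples`), the function names, and the 2-D toy points standing in for utterance embeddings are assumptions made for illustration.

```python
import math

def dbscan(points, eps, min_samples):
    """Plain DBSCAN; returns one label per point, with -1 meaning noise."""
    labels = [None] * len(points)

    def neighbors(i):
        return [j for j, q in enumerate(points) if math.dist(points[i], q) <= eps]

    cluster = -1
    for i in range(len(points)):
        if labels[i] is not None:
            continue
        seeds = neighbors(i)
        if len(seeds) < min_samples:
            labels[i] = -1  # provisional noise; may become a border point later
            continue
        cluster += 1
        labels[i] = cluster
        queue = [j for j in seeds if j != i]
        while queue:
            j = queue.pop()
            if labels[j] == -1:
                labels[j] = cluster  # border point: joins, but is not expanded
            if labels[j] is not None:
                continue
            labels[j] = cluster
            nb = neighbors(j)
            if len(nb) >= min_samples:
                queue.extend(nb)  # core point: keep growing the cluster
    return labels

def iterative_dbscan(points, eps, min_samples, eps_step=0.5, max_rounds=3):
    """Re-cluster leftover noise points with a progressively looser setting."""
    final = [-1] * len(points)
    remaining = list(range(len(points)))
    next_id = 0
    for _ in range(max_rounds):
        if not remaining or min_samples < 2:
            break
        sub = dbscan([points[i] for i in remaining], eps, min_samples)
        for idx, lab in zip(remaining, sub):
            if lab != -1:
                final[idx] = next_id + lab
        next_id += max(sub, default=-1) + 1
        remaining = [i for i, lab in zip(remaining, sub) if lab == -1]
        eps += eps_step
        min_samples -= 1
    return final

# One dense group, one sparse pair, and one outlier (hypothetical data).
points = [(0, 0), (0.1, 0), (0, 0.1), (0.1, 0.1), (5, 5), (5.8, 5), (20, 20)]
labels = iterative_dbscan(points, eps=0.3, min_samples=3, eps_step=0.6)
```

A single DBSCAN pass with the strict setting would mark the sparse pair as noise; the second, looser pass recovers it as its own cluster while the outlier stays unlabeled.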


A Sequence Modeling Approach for Structured Data Extraction from Unstructured Text
Jayati Deshmukh | Annervaz K M | Shubhashis Sengupta
Proceedings of the 5th Workshop on Semantic Deep Learning (SemDeep-5)


Can Taxonomy Help? Improving Semantic Question Matching using Question Taxonomy
Deepak Gupta | Rajkumar Pujari | Asif Ekbal | Pushpak Bhattacharyya | Anutosh Maitra | Tom Jain | Shubhashis Sengupta
Proceedings of the 27th International Conference on Computational Linguistics

In this paper, we propose a hybrid technique for semantic question matching. It uses our proposed two-layered taxonomy for English questions, augmenting state-of-the-art deep learning models with question classes obtained from a deep-learning-based question classifier. Experiments performed on three open-domain datasets demonstrate the effectiveness of our proposed approach. We achieve state-of-the-art results on the partial ordering question ranking (POQR) benchmark dataset. Our empirical analysis shows that coupling standard distributional features (provided by the question encoder) with knowledge from the taxonomy is more effective than either deep learning or taxonomy-based knowledge alone.