Promod Yenigalla


2024

pdf bib
MARS: Multilingual Aspect-centric Review Summarisation
Sandeep Sricharan Mukku | Abinesh Kanagarajan | Chetan Aggarwal | Promod Yenigalla
Proceedings of the 2024 Conference on Empirical Methods in Natural Language Processing: Industry Track

pdf bib
MARCO: Multi-Agent Real-time Chat Orchestration
Anubhav Shrimal | Stanley Kanagaraj | Kriti Biswas | Swarnalatha Raghuraman | Anish Nediyanchath | Yi Zhang | Promod Yenigalla
Proceedings of the 2024 Conference on Empirical Methods in Natural Language Processing: Industry Track

Large language model advancements have enabled the development of multi-agent frameworks to tackle complex, real-world problems such as to automate workflows that require interactions with diverse tools, reasoning, and human collaboration. We present MARCO, a Multi-Agent Real-time Chat Orchestration framework for automating workflows using LLMs. MARCO addresses key challenges in utilizing LLMs for complex, multi-step task execution in a production environment. It incorporates robust guardrails to steer LLM behavior, validate outputs, and recover from errors that stem from inconsistent output formatting, function and parameter hallucination, and lack of domain knowledge. Through extensive experiments we demonstrate MARCO’s superior performance with 94.48% and 92.74% accuracy on task execution for Digital Restaurant Service Platform conversations and Retail conversations datasets respectively along with 44.91% improved latency and 33.71% cost reduction in a production setting. We also report effects of guardrails in performance gain along with comparisons of various LLM models, both open-source and proprietary. The modular and generic design of MARCO allows it to be adapted for automating workflows across domains and to execute complex tasks through multi-turn interactions.

2023

pdf bib
Weakly supervised hierarchical multi-task classification of customer questions
Jitenkumar Rana | Promod Yenigalla | Chetan Aggarwal | Sandeep Sricharan Mukku | Manan Soni | Rashmi Patange
Proceedings of the 61st Annual Meeting of the Association for Computational Linguistics (Volume 5: Industry Track)

Identifying granular and actionable topics from customer questions (CQ) posted on e-commerce websites helps surface the missing information expected by customers on the product detail page (DP), provide insights to brands and sellers on what critical product information that the customers are looking before making a purchase decision and helps enrich the catalog quality to improve the overall customer experience (CX). We propose a weakly supervised Hierarchical Multi-task Classification Framework (HMCF) to identify topics from customer questions at various granularities. Complexity lies in creating a list of granular topics (taxonomy) for 1000s of product categories and building a scalable classification system. To this end, we introduce a clustering based Taxonomy Creation and Data Labeling (TCDL) module for creating taxonomy and labelled data with minimal supervision. Using TCDL module, taxonomy and labelled data creation task reduces to 2 hours as compared to 2 weeks of manual efforts by a subject matter expert. For classification, we propose a two level HMCF that performs multi-class classification to identify coarse level-1 topic and leverages NLI based label-aware approach to identify granular level-2 topic. We showcase that HMCF (based on BERT and NLI) a) achieves absolute improvement of 13% in Top-1 accuracy over single-task non-hierarchical baselines b) learns a generic domain invariant function that can adapt to constantly evolving taxonomy (open label set) without need of re-training. c) reduces model deployment efforts significantly since it needs only one model that caters to 1000s of product categories.

pdf bib
InsightNet : Structured Insight Mining from Customer Feedback
Sandeep Sricharan Mukku | Manan Soni | Chetan Aggarwal | Jitenkumar Rana | Promod Yenigalla | Rashmi Patange | Shyam Mohan
Proceedings of the 2023 Conference on Empirical Methods in Natural Language Processing: Industry Track

We propose InsightNet, a novel approach for the automated extraction of structured insights from customer reviews. Our end-to-end machine learning framework is designed to overcome the limitations of current solutions, including the absence of structure for identified topics, non-standard aspect names, and lack of abundant training data. The proposed solution builds a semi-supervised multi-level taxonomy from raw reviews, a semantic similarity heuristic approach to generate labelled data and employs a multi-task insight extraction architecture by fine-tuning an LLM. InsightNet identifies granular actionable topics with customer sentiments and verbatim for each topic. Evaluations on real-world customer review data show that InsightNet performs better than existing solutions in terms of structure, hierarchy and completeness. We empirically demonstrate that InsightNet outperforms the current state-of-the-art methods in multi-label topic classification, achieving an F1 score of 0.85, which is an improvement of 11% F1-score over the previous best results. Additionally, InsightNet generalises well for unseen aspects and suggests new topics to be added to the taxonomy.

2022

pdf bib
NER-MQMRC: Formulating Named Entity Recognition as Multi Question Machine Reading Comprehension
Anubhav Shrimal | Avi Jain | Kartik Mehta | Promod Yenigalla
Proceedings of the 2022 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies: Industry Track

NER has been traditionally formulated as a sequence labeling task. However, there has been recent trend in posing NER as a machine reading comprehension task (Wang et al., 2020; Mengge et al., 2020), where entity name (or other information) is considered as a question, text as the context and entity value in text as answer snippet. These works consider MRC based on a single question (entity) at a time. We propose posing NER as a multi-question MRC task, where multiple questions (one question per entity) are considered at the same time for a single text. We propose a novel BERT-based multi-question MRC (NER-MQMRC) architecture for this formulation. NER-MQMRC architecture considers all entities as input to BERT for learning token embeddings with self-attention and leverages BERT-based entity representation for further improving these token embeddings for NER task. Evaluation on three NER datasets show that our proposed architecture leads to average 2.5 times faster training and 2.3 times faster inference as compared to NER-SQMRC framework based models by considering all entities together in a single pass. Further, we show that our model performance does not degrade compared to single-question based MRC (NER-SQMRC) (Devlin et al., 2019) leading to F1 gain of +0.41%, +0.32% and +0.27% for AE-Pub, Ecommerce5PT and Twitter datasets respectively. We propose this architecture primarily to solve large scale e-commerce attribute (or entity) extraction from unstructured text of a magnitude of 50k+ attributes to be extracted on a scalable production environment with high performance and optimised training and inference runtimes.

2020

pdf bib
AMUSED: A Multi-Stream Vector Representation Method for Use in Natural Dialogue
Gaurav Kumar | Rishabh Joshi | Jaspreet Singh | Promod Yenigalla
Proceedings of the Twelfth Language Resources and Evaluation Conference

The problem of building a coherent and non-monotonous conversational agent with proper discourse and coverage is still an area of open research. Current architectures only take care of semantic and contextual information for a given query and fail to completely account for syntactic and external knowledge which are crucial for generating responses in a chit-chat system. To overcome this problem, we propose an end to end multi-stream deep learning architecture that learns unified embeddings for query-response pairs by leveraging contextual information from memory networks and syntactic information by incorporating Graph Convolution Networks (GCN) over their dependency parse. A stream of this network also utilizes transfer learning by pre-training a bidirectional transformer to extract semantic representation for each input sentence and incorporates external knowledge through the neighborhood of the entities from a Knowledge Base (KB). We benchmark these embeddings on the next sentence prediction task and significantly improve upon the existing techniques. Furthermore, we use AMUSED to represent query and responses along with its context to develop a retrieval based conversational agent which has been validated by expert linguists to have comprehensive engagement with humans.