Rahul Bhagat
2025
Learning to Rewrite Negation Queries in Product Search
Mengtian Guo | Mutasem Al-Darabsah | Choon Hui Teo | Jonathan May | Tarun Agarwal | Rahul Bhagat
Proceedings of the 31st International Conference on Computational Linguistics: Industry Track
In product search, negation is frequently used to articulate unwanted product features or components. Modern search engines often struggle to comprehend negations, resulting in suboptimal user experiences. While various methods have been proposed to tackle negations in search, none of them took the vocabulary gap between query keywords and product text into consideration. In this work, we introduced a query rewriting approach to enhance the performance of product search engines when dealing with queries with negations. First, we introduced a data generation workflow that leverages large language models (LLMs) to extract query rewrites from product text. Subsequently, we trained a Seq2Seq model to generate query rewrites for unseen queries. Our experiments demonstrated that query rewriting yields a 3.17% precision@30 improvement for queries with negations. The promising results pave the way for further research on enhancing the search performance of queries with negations.
2023
CUPID: Curriculum Learning Based Real-Time Prediction using Distillation
Arindam Bhattacharya | Ankith Ms | Ankit Gandhi | Vijay Huddar | Atul Saroop | Rahul Bhagat
Proceedings of the 61st Annual Meeting of the Association for Computational Linguistics (Volume 5: Industry Track)
Relevance in E-commerce Product Search is crucial for providing customers with accurate results that match their query intent. With recent advancements in NLP and Deep Learning, Transformers have become the default choice for relevance classification tasks. In such a setting, the relevance model uses query text and product title as input features, and estimates if the product is relevant for the customer query. While cross-attention in Transformers enables a more accurate relevance prediction in such a setting, its high evaluation latency makes it unsuitable for real-time predictions in which thousands of products must be evaluated against a user query within a few milliseconds. To address this issue, we propose CUPID: a Curriculum learning based real-time Prediction using Distillation that utilizes knowledge distillation within a curriculum learning setting to learn a simpler architecture that can be evaluated within low latency budgets. In a bi-lingual relevance prediction task, our approach shows a 302 bps improvement on English and a 676 bps improvement on low-resource Arabic, while maintaining the low evaluation latency on CPUs.
2022
Augmenting Training Data for Massive Semantic Matching Models in Low-Traffic E-commerce Stores
Ashutosh Joshi | Shankar Vishwanath | Choon Teo | Vaclav Petricek | Vishy Vishwanathan | Rahul Bhagat | Jonathan May
Proceedings of the 2022 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies: Industry Track
Extreme multi-label classification (XMC) systems have been successfully applied in e-commerce (Shen et al., 2020; Dahiya et al., 2021) for retrieving products based on customer behavior. Such systems require large amounts of customer behavior data (e.g. queries, clicks, purchases) for training. However, behavioral data is limited in low-traffic e-commerce stores, impacting performance of these systems. In this paper, we present a technique that augments behavioral training data via query reformulation. We use the Aggregated Label eXtreme Multi-label Classification (AL-XMC) system (Shen et al., 2020) as an example semantic matching model and show via crowd-sourced human judgments that, when the training data is augmented through query reformulations, the quality of AL-XMC improves over a baseline that does not use query reformulation. We also show in online A/B tests that our method significantly improves business metrics for the AL-XMC model.
2013
Squibs: What Is a Paraphrase?
Rahul Bhagat | Eduard Hovy
Computational Linguistics, Volume 39, Issue 3 - September 2013
2008
Weakly-Supervised Acquisition of Labeled Class Instances using Graph Random Walks
Partha Pratim Talukdar | Joseph Reisinger | Marius Paşca | Deepak Ravichandran | Rahul Bhagat | Fernando Pereira
Proceedings of the 2008 Conference on Empirical Methods in Natural Language Processing
Large Scale Acquisition of Paraphrases for Learning Surface Patterns
Rahul Bhagat | Deepak Ravichandran
Proceedings of ACL-08: HLT
2007
LEDIR: An Unsupervised Algorithm for Learning Directionality of Inference Rules
Rahul Bhagat | Patrick Pantel | Eduard Hovy
Proceedings of the 2007 Joint Conference on Empirical Methods in Natural Language Processing and Computational Natural Language Learning (EMNLP-CoNLL)
ISP: Learning Inferential Selectional Preferences
Patrick Pantel | Rahul Bhagat | Bonaventura Coppola | Timothy Chklovski | Eduard Hovy
Human Language Technologies 2007: The Conference of the North American Chapter of the Association for Computational Linguistics; Proceedings of the Main Conference
2005
Co-authors
- Eduard Hovy 4
- Jonathan May 2
- Patrick Pantel 2
- Deepak Ravichandran 2
- Tarun Agarwal 1
- Mutasem Al-Darabsah 1
- Arindam Bhattacharya 1
- Timothy Chklovski 1
- Bonaventura Coppola 1
- Ankit Gandhi 1
- Mengtian Guo 1
- Vijay Huddar 1
- Ashutosh Joshi 1
- Anton Leuski 1
- Ankith Ms 1
- Marius Pasca 1
- Fernando Pereira 1
- Vaclav Petricek 1
- Joseph Reisinger 1
- Atul Saroop 1
- Partha Talukdar 1
- Choon Teo 1
- Choon Hui Teo 1
- Shankar Vishwanath 1
- Vishy Vishwanathan 1