Anirban Majumder
2024
PEARL: Preference Extraction with Exemplar Augmentation and Retrieval with LLM Agents
Vijit Malik | Akshay Jagatap | Vinayak S Puranik | Anirban Majumder
Proceedings of the 2024 Conference on Empirical Methods in Natural Language Processing: Industry Track
Identifying the preferences of customers in their shopping journey is a pivotal aspect of providing product recommendations. The task becomes increasingly challenging when there is a multi-turn conversation between the user and a shopping assistant chatbot. In this paper, we tackle the novel and complex problem of identifying customer preferences, in the form of key-value filters on an e-commerce website, in a multi-turn conversational setting. Existing systems specialize in extracting customer preferences from standalone customer queries, which makes them unsuitable for a multi-turn setup. We propose PEARL (Preference Extraction with Exemplar Augmentation and Retrieval with LLM Agents), which leverages collaborative LLM agents, generates in-context learning exemplars, and dynamically retrieves relevant exemplars at inference time to extract customer preferences as a combination of key-value filters. Our experiments on proprietary and public datasets show that PEARL not only improves exact-match performance by ~10% over competitive LLM-based baselines but also improves inference latency by ~110%.
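As a rough illustration of the exemplar-retrieval step described in the abstract, the sketch below retrieves the stored exemplars most similar to an incoming conversation and assembles them into a few-shot prompt for filter extraction. All names here (`retrieve_exemplars`, `build_prompt`, the exemplar record format) are hypothetical stand-ins, not the paper's actual agents or prompts.

```python
# Minimal sketch of exemplar retrieval for in-context learning (hypothetical
# names; the paper's actual pipeline and prompts are not given in the abstract).
import numpy as np

def retrieve_exemplars(query_vec: np.ndarray,
                       exemplar_vecs: np.ndarray,
                       exemplars: list[dict],
                       k: int = 4) -> list[dict]:
    """Return the k exemplars whose embeddings are most similar to the query."""
    # Cosine similarity between the query and every stored exemplar.
    sims = exemplar_vecs @ query_vec / (
        np.linalg.norm(exemplar_vecs, axis=1) * np.linalg.norm(query_vec) + 1e-9)
    top = np.argsort(-sims)[:k]
    return [exemplars[i] for i in top]

def build_prompt(conversation: str, shots: list[dict]) -> str:
    """Assemble a few-shot prompt mapping conversations to key-value filters."""
    blocks = [f"Conversation: {s['conversation']}\nFilters: {s['filters']}"
              for s in shots]
    blocks.append(f"Conversation: {conversation}\nFilters:")
    return "\n\n".join(blocks)
```

The resulting prompt would then be passed to the LLM, which completes the final `Filters:` slot with the extracted key-value pairs.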
2023
PROTEGE: Prompt-based Diverse Question Generation from Web Articles
Vinayak Puranik | Anirban Majumder | Vineet Chaoji
Findings of the Association for Computational Linguistics: EMNLP 2023
Rich and diverse knowledge bases (KBs) are foundational building blocks for online knowledge sharing communities such as StackOverflow and Quora, and applications such as conversational assistants (aka chatbots). A popular format for knowledge bases is question-answer pairs (or FAQs), where questions are designed to accurately match a multitude of queries. In this paper, we address the problem of automatic creation of such Q&A-based knowledge bases from domain-specific, long-form textual content (e.g., web articles). Specifically, we consider the problem of question generation, which is the task of generating questions given a paragraph of text as input, with the goal of achieving both diversity and fidelity in the generated questions. Towards this goal, we propose PROTEGE, a diverse question generation framework that consists of (1) a novel encoder-decoder based Large Language Model (LLM) architecture which can take a variety of prompts and generate a diverse set of candidate questions, and (2) a hill-climbing algorithm that maximizes a sub-modular objective function to balance diversity with fidelity. Through our experiments on three popular public Q&A datasets, we demonstrate that PROTEGE improves diversity by +16% and fidelity by +8% over diverse beam search and prompt-based baselines.
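The hill-climbing step can be pictured as a greedy search over a submodular trade-off between fidelity and diversity. Below is a minimal sketch assuming a modular fidelity term plus a facility-location coverage term as the diversity component; PROTEGE's exact objective may differ.

```python
# Illustrative greedy hill-climbing on a submodular objective trading off
# question fidelity against diversity. The facility-location coverage term
# is a common stand-in, not necessarily the paper's exact formulation.
import numpy as np

def objective(selected: list[int], fidelity: np.ndarray,
              sim: np.ndarray, lam: float) -> float:
    """f(S) = sum of fidelity scores + lam * facility-location coverage."""
    if not selected:
        return 0.0
    # Coverage: how well the selected questions represent all candidates.
    coverage = sim[:, selected].max(axis=1).sum()
    return fidelity[selected].sum() + lam * coverage

def greedy_select(fidelity: np.ndarray, sim: np.ndarray,
                  budget: int, lam: float = 1.0) -> list[int]:
    """Greedily add the candidate with the largest marginal gain.

    Assumes budget <= number of candidates; `sim` is an (n, n) pairwise
    similarity matrix and `fidelity` an (n,) vector of non-negative scores.
    """
    selected: list[int] = []
    remaining = set(range(len(fidelity)))
    for _ in range(budget):
        base = objective(selected, fidelity, sim, lam)
        best, best_gain = None, -np.inf
        for i in remaining:
            gain = objective(selected + [i], fidelity, sim, lam) - base
            if gain > best_gain:
                best, best_gain = i, gain
        selected.append(best)
        remaining.remove(best)
    return selected
```

Because a non-negative modular term plus facility-location coverage is monotone submodular, this greedy marginal-gain loop enjoys the standard (1 − 1/e) approximation guarantee under a cardinality constraint.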
2021
Deep Embedding of Conversation Segments
Abir Chakraborty | Anirban Majumder
Proceedings of the 18th International Conference on Natural Language Processing (ICON)
We introduce a novel conversation embedding by extending the Bidirectional Encoder Representations from Transformers (BERT) framework. Specifically, information related to “turn” and “role”, which is unique to conversations, is added to the word tokens, and the next-sentence-prediction task predicts a segment of a conversation possibly spanning multiple roles and turns. We observe that the addition of role and turn information substantially increases next-sentence-prediction accuracy. Conversation embeddings obtained in this fashion are applied to (a) conversation clustering, (b) conversation classification, and (c) providing context for automated conversation generation on new datasets (unseen by the pre-trained model). We find that clustering accuracy is greatly improved when these embeddings are used as features, as opposed to conventional tf-idf-based features that take neither role nor turn information into account. On the classification task, a model fine-tuned on the conversation embedding achieves accuracy comparable to an optimized linear SVM on tf-idf features. Finally, we present a way of capturing variable-length context in sequence-to-sequence models by utilizing this conversation embedding, and show that the BLEU score improves over a vanilla sequence-to-sequence model without context.
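The turn/role augmentation amounts to adding two extra embedding tables on top of BERT's usual token and position embeddings. A minimal PyTorch sketch, with illustrative vocabulary sizes and dimensions (not the paper's exact configuration):

```python
# Sketch of augmenting BERT-style input embeddings with role and turn
# information, as described above. Sizes below are illustrative assumptions.
import torch
import torch.nn as nn

class ConversationEmbedding(nn.Module):
    def __init__(self, vocab_size=30522, hidden=768,
                 max_pos=512, n_roles=2, max_turns=64):
        super().__init__()
        self.tok = nn.Embedding(vocab_size, hidden)
        self.pos = nn.Embedding(max_pos, hidden)
        self.role = nn.Embedding(n_roles, hidden)    # e.g., customer vs. agent
        self.turn = nn.Embedding(max_turns, hidden)  # index of the dialogue turn
        self.norm = nn.LayerNorm(hidden)

    def forward(self, token_ids, role_ids, turn_ids):
        # token_ids, role_ids, turn_ids: (batch, seq) integer tensors.
        positions = torch.arange(token_ids.size(1), device=token_ids.device)
        x = (self.tok(token_ids) + self.pos(positions)
             + self.role(role_ids) + self.turn(turn_ids))
        return self.norm(x)
```

The summed embeddings feed into the standard transformer encoder, so pre-training objectives such as next-segment prediction need no architectural changes beyond the two extra tables.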