Vaibhav Mavi


2023

Retrieval-Augmented Chain-of-Thought in Semi-structured Domains
Vaibhav Mavi | Abulhair Saparov | Chen Zhao
Proceedings of the Natural Legal Language Processing Workshop 2023

Applying existing question answering (QA) systems to specialized domains like law and finance presents challenges that necessitate domain expertise. Although large language models (LLMs) have shown impressive language comprehension and in-context learning capabilities, their inability to handle very long inputs/contexts is well known. Tasks specific to these domains need significant background knowledge, leading to contexts that can often exceed the maximum length that existing LLMs can process. This study explores leveraging the semi-structured nature of legal and financial data to efficiently retrieve relevant context, enabling the use of LLMs for domain-specialized QA. The resulting system outperforms contemporary models and also provides useful explanations for the answers, encouraging the integration of LLMs into legal and financial NLP systems for future research.

2020

Semantic Extractor-Paraphraser based Abstractive Summarization
Anubhav Jangra | Raghav Jain | Vaibhav Mavi | Sriparna Saha | Pushpak Bhattacharyya
Proceedings of the 17th International Conference on Natural Language Processing (ICON)

The anthology of spoken languages today is inundated with textual information, necessitating the development of automatic summarization models. In this manuscript, we propose an extractor-paraphraser based abstractive summarization system that exploits semantic overlap, as opposed to its predecessors, which focus more on syntactic information overlap. Our model outperforms the state-of-the-art baselines in terms of ROUGE, METEOR and word mover similarity (WMS), establishing the superiority of the proposed system via extensive ablation experiments. We have also challenged the summarization capabilities of the state-of-the-art Pointer Generator Network (PGN) and, through thorough experimentation, shown that PGN is more of a paraphraser, contrary to the prevailing notion of a summarizer, illustrating its inability to accumulate information across multiple sentences.