2023
pdf
bib
abs
Few-shot Unified Question Answering: Tuning Models or Prompts?
Srijan Bansal
|
Semih Yavuz
|
Bo Pang
|
Meghana Bhat
|
Yingbo Zhou
Findings of the Association for Computational Linguistics: EMNLP 2023
Question-answering (QA) tasks often investigate specific question types, knowledge domains, or reasoning skills, leading to specialized models catering to specific categories of QA tasks. While recent research has explored the idea of unified QA models, such models are usually explored for high-resource scenarios and require re-training to extend their capabilities. To overcome these drawbacks, the paper explores the potential of two paradigms of tuning, model, and prompts, for unified QA under a low-resource setting. The paper provides an exhaustive analysis of their applicability using 16 QA datasets, revealing that prompt tuning can perform as well as model tuning in a few-shot setting with a good initialization. The study also shows that parameter-sharing results in superior few-shot performance, simple knowledge transfer techniques for prompt initialization can be effective, and prompt tuning achieves a significant performance boost from pre-training in a low-resource regime. The research offers insights into the advantages and limitations of prompt tuning for unified QA in a few-shot setting, contributing to the development of effective and efficient systems in low-resource scenarios.
pdf
bib
abs
PEFTDebias : Capturing debiasing information using PEFTs
Sumit Agarwal
|
Aditya Veerubhotla
|
Srijan Bansal
Proceedings of the 2023 Conference on Empirical Methods in Natural Language Processing
The increasing use of foundation models highlights the urgent need to address and eliminate implicit biases present in them that arise during pretraining. In this paper, we introduce PEFTDebias, a novel approach that employs parameter-efficient fine-tuning (PEFT) to mitigate the biases within foundation models. PEFTDebias consists of two main phases: an upstream phase for acquiring debiasing parameters along a specific bias axis, and a downstream phase where these parameters are incorporated into the model and frozen during the fine-tuning process. By evaluating on four datasets across two bias axes namely gender and race, we find that downstream biases can be effectively reduced with PEFTs. In addition, we show that these parameters possess axis-specific debiasing characteristics, enabling their effective transferability in mitigating biases in various downstream tasks.
pdf
bib
abs
Language-Agnostic Transformers and Assessing ChatGPT-Based Query Rewriting for Multilingual Document-Grounded QA
Srinivas Gowriraj
|
Soham Dinesh Tiwari
|
Mitali Potnis
|
Srijan Bansal
|
Teruko Mitamura
|
Eric Nyberg
Proceedings of the Third DialDoc Workshop on Document-grounded Dialogue and Conversational Question Answering
The DialDoc 2023 shared task has expanded the document-grounded dialogue task to encompass multiple languages, despite having limited annotated data. This paper assesses the effectiveness of both language-agnostic and language-aware paradigms for multilingual pre-trained transformer models in a bi-encoder-based dense passage retriever (DPR), concluding that the language-agnostic approach is superior. Additionally, the study investigates the impact of query rewriting techniques using large language models, such as ChatGPT, on multilingual, document-grounded question-answering systems. The experiments conducted demonstrate that, for the examples examined, query rewriting does not enhance performance compared to the original queries. This failure is due to topic switching in final dialogue turns and irrelevant topics being considered for query rewriting.
2022
pdf
bib
abs
R3 : Refined Retriever-Reader pipeline for Multidoc2dial
Srijan Bansal
|
Suraj Tripathi
|
Sumit Agarwal
|
Sireesh Gururaja
|
Aditya Srikanth Veerubhotla
|
Ritam Dutt
|
Teruko Mitamura
|
Eric Nyberg
Proceedings of the Second DialDoc Workshop on Document-grounded Dialogue and Conversational Question Answering
In this paper, we present our submission to the DialDoc shared task based on the MultiDoc2Dial dataset. MultiDoc2Dial is a conversational question answering dataset that grounds dialogues in multiple documents. The task involves grounding a user’s query in a document followed by generating an appropriate response. We propose several improvements over the baseline’s retriever-reader architecture to aid in modeling goal-oriented dialogues grounded in multiple documents. Our proposed approach employs sparse representations for passage retrieval, a passage re-ranker, the fusion-in-decoder architecture for generation, and a curriculum learning training paradigm. Our approach shows a 12 point improvement in BLEU score compared to the baseline RAG model.
pdf
bib
abs
PRO-CS : An Instance-Based Prompt Composition Technique for Code-Switched Tasks
Srijan Bansal
|
Suraj Tripathi
|
Sumit Agarwal
|
Teruko Mitamura
|
Eric Nyberg
Proceedings of the 2022 Conference on Empirical Methods in Natural Language Processing
Code-switched (CS) data is ubiquitous in today’s globalized world, but the dearth of annotated datasets in code-switching poses a significant challenge for learning diverse tasks across different language pairs. Parameter-efficient prompt-tuning approaches conditioned on frozen language models have shown promise for transfer learning in limited-resource setups. In this paper, we propose a novel instance-based prompt composition technique, PRO-CS, for CS tasks that combine language and task knowledge. We compare our approach with prompt-tuning and fine-tuning for code-switched tasks on 10 datasets across 4 language pairs. Our model outperforms the prompt-tuning approach by significant margins across all datasets and outperforms or remains at par with fine-tuning by using just 0.18% of total parameters. We also achieve competitive results when compared with the fine-tuned model in the low-resource cross-lingual and cross-task setting, indicating the effectiveness of our approach to incorporate new code-switched tasks.
2020
pdf
bib
abs
Code-Switching Patterns Can Be an Effective Route to Improve Performance of Downstream NLP Applications: A Case Study of Humour, Sarcasm and Hate Speech Detection
Srijan Bansal
|
Vishal Garimella
|
Ayush Suhane
|
Jasabanta Patro
|
Animesh Mukherjee
Proceedings of the 58th Annual Meeting of the Association for Computational Linguistics
In this paper, we demonstrate how code-switching patterns can be utilised to improve various downstream NLP applications. In particular, we encode various switching features to improve humour, sarcasm and hate speech detection tasks. We believe that this simple linguistic observation can also be potentially helpful in improving other similar NLP applications.
2019
pdf
bib
abs
A deep-learning framework to detect sarcasm targets
Jasabanta Patro
|
Srijan Bansal
|
Animesh Mukherjee
Proceedings of the 2019 Conference on Empirical Methods in Natural Language Processing and the 9th International Joint Conference on Natural Language Processing (EMNLP-IJCNLP)
In this paper we propose a deep learning framework for sarcasm target detection in predefined sarcastic texts. Identification of sarcasm targets can help in many core natural language processing tasks such as aspect based sentiment analysis, opinion mining etc. To begin with, we perform an empirical study of the socio-linguistic features and identify those that are statistically significant in indicating sarcasm targets (p-values in the range(0.05,0.001)). Finally, we present a deep-learning framework augmented with socio-linguistic features to detect sarcasm targets in sarcastic book-snippets and tweets. We achieve a huge improvement in the performance in terms of exact match and dice scores compared to the current state-of-the-art baseline.