Parag Pravin Dakle

2025

pdf bib abs
Investigating the effectiveness of length based rewards in DPO for building Conversational Financial Question Answering Systems
Anushka Yadav | Sai Krishna Rallabandi | Parag Pravin Dakle | Preethi Raghavan
Proceedings of the Joint Workshop of the 9th Financial Technology and Natural Language Processing (FinNLP), the 6th Financial Narrative Processing (FNP), and the 1st Workshop on Large Language Models for Finance and Legal (LLMFinLegal)

In this paper, we address the numerical reasoning challenges of financial question-answering systems. We propose a two-stage approach where models first generate intermediate calculations and then produce the final answer. We perform two set of experiments to evaluate the performance of our approach. In the first, we compare single-step and multi-step approaches, demonstrating that incorporating intermediate calculations significantly improves numerical accuracy. In the second experiment, we compare traditional DPO and iterative DPO (iDPO) with length-regularized DPO. We show that while traditional DPO reduced parsing errors, it introduces verbosity; iDPO improves reasoning iteratively but faces diminishing returns. On the other hand, Length-regularized DPO reduces verbosity of intermediate calculation as well as enhances numerical accuracy across all models. These results highlight the potential of combining intermediate reasoning steps with domain-specific optimizations to build robust financial question-answering systems.

2024

pdf bib abs
Jetsons at FinNLP 2024: Towards Understanding the ESG Impact of a News Article Using Transformer-based Models
Parag Pravin Dakle | Alolika Gon | Sihan Zha | Liang Wang | Sai Krishna Rallabandi | Preethi Raghavan
Proceedings of the Joint Workshop of the 7th Financial Technology and Natural Language Processing, the 5th Knowledge Discovery from Unstructured Data in Financial Services, and the 4th Workshop on Economics and Natural Language Processing

In this paper, we describe the different approaches explored by the Jetsons team for the Multi-Lingual ESG Impact Duration Inference (ML-ESG-3) shared task. The shared task focuses on predicting the duration and type of the ESG impact of a news article. The shared task dataset consists of 2,059 news titles and articles in English, French, Korean, and Japanese languages. For the impact duration classification task, we fine-tuned XLM-RoBERTa with a custom fine-tuning strategy and using self-training and DeBERTa-v3 using only English translations. These models individually ranked first on the leaderboard for Korean and Japanese and in an ensemble for the English language, respectively. For the impact type classification task, our XLM-RoBERTa model fine-tuned using a custom fine-tuning strategy ranked first for the English language.

2023

pdf bib
Jetsons at the FinNLP-2023: Using Synthetic Data and Transfer Learning for Multilingual ESG Issue Classification
Parker Glenn | Alolika Gon | Nikhil Kohli | Sihan Zha | Parag Pravin Dakle | Preethi Raghavan
Proceedings of the Fifth Workshop on Financial Technology and Natural Language Processing and the Second Multimodal AI For Financial Forecasting

pdf bib abs
Correcting Semantic Parses with Natural Language through Dynamic Schema Encoding
Parker Glenn | Parag Pravin Dakle | Preethi Raghavan
Proceedings of the 5th Workshop on NLP for Conversational AI (NLP4ConvAI 2023)

In addressing the task of converting natural language to SQL queries, there are several semantic and syntactic challenges. It becomes increasingly important to understand and remedy the points of failure as the performance of semantic parsing systems improve. We explore semantic parse correction with natural language feedback, proposing a new solution built on the success of autoregressive decoders in text-to-SQL tasks. By separating the semantic and syntactic difficulties of the task, we show that the accuracy of text-to-SQL parsers can be boosted by up to 26% with only one turn of correction with natural language. Additionally, we show that a T5-base model is capable of correcting the errors of a T5-large model in a zero-shot, cross-parser setting.

2022

pdf bib abs
Jetsons at the FinNLP-2022 ERAI Task: BERT-Chinese for mining high MPP posts
Alolika Gon | Sihan Zha | Sai Krishna Rallabandi | Parag Pravin Dakle | Preethi Raghavan
Proceedings of the Fourth Workshop on Financial Technology and Natural Language Processing (FinNLP)

In this paper, we discuss the various approaches by the Jetsons team for the “Pairwise Comparison” sub-task of the ERAI shared task to compare financial opinions for profitability and loss. Our BERT-Chinese model considers a pair of opinions and predicts the one with a higher maximum potential profit (MPP) with 62.07% accuracy. We analyze the performance of our approaches on both the MPP and maximal loss (ML) problems and deeply dive into why BERT-Chinese outperforms other models.

pdf bib abs
Using Transformer-based Models for Taxonomy Enrichment and Sentence Classification
Parag Pravin Dakle | Shrikumar Patil | Sai Krishna Rallabandi | Chaitra Hegde | Preethi Raghavan
Proceedings of the Fourth Workshop on Financial Technology and Natural Language Processing (FinNLP)

In this paper, we present a system that addresses the taxonomy enrichment problem for Environment, Social and Governance issues in the financial domain, as well as classifying sentences as sustainable or unsustainable, for FinSim4-ESG, a shared task for the FinNLP workshop at IJCAI-2022. We first created a derived dataset for taxonomy enrichment by using a sentence-BERT-based paraphrase detector (Reimers and Gurevych, 2019) (on the train set) to create positive and negative term-concept pairs. We then model the problem by fine-tuning the sentence-BERT-based paraphrase detector on this derived dataset, and use it as the encoder, and use a Logistic Regression classifier as the decoder, resulting in test Accuracy: 0.6 and Avg. Rank: 1.97. In case of the sentence classification task, the best-performing classifier (Accuracy: 0.92) consists of a pre-trained RoBERTa model (Liu et al., 2019a) as the encoder and a Feed Forward Neural Network classifier as the decoder.

2020

pdf bib abs
CEREC: A Corpus for Entity Resolution in Email Conversations
Parag Pravin Dakle | Dan Moldovan
Proceedings of the 28th International Conference on Computational Linguistics

We present the first large scale corpus for entity resolution in email conversations (CEREC). The corpus consists of 6001 email threads from the Enron Email Corpus containing 36,448 email messages and 38,996 entity coreference chains. The annotation is carried out as a two-step process with minimal manual effort. Experiments are carried out for evaluating different features and performance of four baselines on the created corpus. For the task of mention identification and coreference resolution, a best performance of 54.1 F1 is reported, highlighting the room for improvement. An in-depth qualitative and quantitative error analysis is presented to understand the limitations of the baselines considered.

pdf bib abs
A Study on Entity Resolution for Email Conversations
Parag Pravin Dakle | Takshak Desai | Dan Moldovan
Proceedings of the Twelfth Language Resources and Evaluation Conference

This paper investigates the problem of entity resolution for email conversations and presents a seed annotated corpus of email threads labeled with entity coreference chains. Characteristics of email threads concerning reference resolution are first discussed, and then the creation of the corpus and annotation steps are explained. Finally, performance of the current state-of-the-art deep learning models on the seed corpus is evaluated and qualitative error analysis on the predictions obtained is presented.

pdf bib abs
Joint Learning of Syntactic Features Helps Discourse Segmentation
Takshak Desai | Parag Pravin Dakle | Dan Moldovan
Proceedings of the Twelfth Language Resources and Evaluation Conference

This paper describes an accurate framework for carrying out multi-lingual discourse segmentation with BERT (Devlin et al., 2019). The model is trained to identify segments by casting the problem as a token classification problem and jointly learning syntactic features like part-of-speech tags and dependency relations. This leads to significant improvements in performance. Experiments are performed in different languages, such as English, Dutch, German, Portuguese Brazilian and Basque to highlight the cross-lingual effectiveness of the segmenter. In particular, the model achieves a state-of-the-art F-score of 96.7 for the RST-DT corpus (Carlson et al., 2003) improving on the previous best model by 7.2%. Additionally, a qualitative explanation is provided for how proposed changes contribute to model performance by analyzing errors made on the test data.

Co-authors

Venues

Fix data