Rahul Kumar


2024

BookSQL: A Large Scale Text-to-SQL Dataset for Accounting Domain
Rahul Kumar | Amar Raja Dibbu | Shrutendra Harsola | Vignesh Subrahmaniam | Ashutosh Modi
Proceedings of the 2024 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies (Volume 1: Long Papers)

Several large-scale datasets (e.g., WikiSQL, Spider) for developing natural language interfaces to databases have recently been proposed. These datasets cover a wide breadth of domains but fall short on some essential domains, such as finance and accounting. Given that accounting databases are used worldwide, particularly by non-technical people, there is an imminent need to develop models that could help extract information from accounting databases via natural language queries. In this resource paper, we aim to fill this gap by proposing a new large-scale Text-to-SQL dataset for the accounting and financial domain: BookSQL. The dataset consists of 100k natural language query-SQL pairs and accounting databases with 1 million records. We experiment with and analyze existing state-of-the-art models (including GPT-4) for the Text-to-SQL task on BookSQL. We find significant performance gaps, pointing to the need for more focused models for this domain.

Preset-Voice Matching for Privacy Regulated Speech-to-Speech Translation Systems
Daniel Platnick | Bishoy Abdelnour | Eamon Earl | Rahul Kumar | Zahra Rezaei | Thomas Tsangaris | Faraj Lagum
Proceedings of the Fifth Workshop on Privacy in Natural Language Processing

In recent years, there has been increased demand for speech-to-speech translation (S2ST) systems in industry settings. Although successfully commercialized, cloning-based S2ST systems expose their distributors to liabilities when misused by individuals and can infringe on personality rights when exploited by media organizations. This work proposes a regulated S2ST framework called Preset-Voice Matching (PVM). PVM removes cross-lingual voice cloning in S2ST by first matching the input voice to the voice of a similar, previously consenting speaker in the target language. With this separation, PVM avoids cloning the input speaker, ensuring PVM systems comply with regulations and reduce the risk of misuse. Our results demonstrate PVM can significantly improve S2ST system run-time in multi-speaker settings and the naturalness of S2ST synthesized speech. To our knowledge, PVM is the first explicitly regulated S2ST framework leveraging similarly matched preset voices for dynamic S2ST tasks.

2023

NLMs: Augmenting Negation in Language Models
Rituraj Singh | Rahul Kumar | Vivek Sridhar
Findings of the Association for Computational Linguistics: EMNLP 2023

Negation is a fundamental component of natural language that reverses the semantic meaning of a sentence. It plays an extremely important role across a wide range of applications, yet it is underrepresented in pre-trained language models (LMs), often resulting in wrong inferences. In this work, we try to improve the underlying understanding of negation in pre-trained LMs. To augment negation understanding, we propose a language model objective with a weighted cross-entropy loss and elastic weight consolidation regularization. We reduce the mean top 1 error rate for BERT-base to 1.1%, BERT-large to 0.78%, RoBERTa-base to 3.74%, RoBERTa-large to 0.01% on the negated LAMA dataset. It minimizes the BERT error rate by a margin of 8% and also outperforms the existing negation models. We also provide empirical evidence that negation-augmented models outperform the classical models on original as well as negation benchmarks on natural language inference tasks.

2022

Many Hands Make Light Work: Using Essay Traits to Automatically Score Essays
Rahul Kumar | Sandeep Mathias | Sriparna Saha | Pushpak Bhattacharyya
Proceedings of the 2022 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies

Most research in the area of automatic essay grading (AEG) is geared towards scoring the essay holistically, while comparatively little work has been done on scoring individual essay traits. In this paper, we describe a way to score essays using a multi-task learning (MTL) approach, where scoring the essay holistically is the primary task, and scoring the essay traits is the auxiliary task. We compare our results with a single-task learning (STL) approach, using both LSTMs and BiLSTMs. To find out which traits work best for different types of essays, we conduct ablation tests for each of the essay traits. We also report the runtime and number of training parameters for each system. We find that the MTL-based BiLSTM system gives the best results for scoring the essay holistically, as well as performing well on scoring the essay traits. The MTL systems also give a speed-up of between 2.30 and 3.70 times the speed of the STL system when it comes to scoring the essay and all the traits.

2021

Semantics of Spatio-Directional Geometric Terms of Indian Languages
Sukhada Sukhada | Paul Soma | Rahul Kumar | Karthik Puranik
Proceedings of the 18th International Conference on Natural Language Processing (ICON)

This paper examines widely prevalent yet little-studied expressions in Indian languages which are known as geometrical terms because “they engage locations along the axes of the reference object”. These terms are andara (inside), bāhara (outside), āge (in front of), sāmane (in front of), pīche (back), ūpara (above/over), nīce (under/below), dāyeṃ (right), bāyeṃ (left), pāsa (near), dūra (away/far) in Hindi. The way these terms have been interpreted by the scholars of the Hindi language and handled in the Hindi Dependency treebank is misleading. This paper proposes an alternative analysis of these terms focusing on their triple (nominal, modifier, and relational) functions and presents abstract semantic representations of these terms following the proposed analysis. The semantic representation will be explicit, unambiguous, abstract, and therefore universal in nature. The correspondences of these terms in Bangla and Kannada are also identified. Disambiguation of geometric terms will facilitate parsing and machine translation, especially from Indian languages to English, because these geometric terms of Indian languages are translated variously into English depending on context.

2020

On-Device detection of sentence completion for voice assistants with low-memory footprint
Rahul Kumar | Vijeta Gour | Chandan Pandey | Godawari Sudhakar Rao | Priyadarshini Pai | Anmol Bhasin | Ranjan Samal
Proceedings of the 17th International Conference on Natural Language Processing (ICON)

Sentence completion detection (SCD) is an important task for various downstream Natural Language Processing (NLP) based applications. For NLP-based applications that use Automatic Speech Recognition (ASR) from third parties as a service, SCD is essential to prevent unnecessary processing. Conventional approaches for SCD operate within the confines of sentence boundary detection using language models or sentence end detection using speech and text features. These have limitations in terms of relevant available data for training, performance within the memory and latency constraints, and generalizability across voice assistant domains. In this paper, we propose a novel sentence completion detection method with a low memory footprint for on-device applications. We explore various sequence-level and sentence-level experiments using state-of-the-art Bi-LSTM and BERT-based models for the English language.