Pranamya Patil

2024

SRIB-NMT’s Submission to the Indic MT Shared Task in WMT 2024
Pranamya Patil | Raghavendra Hr | Aditya Raghuwanshi | Kushal Verma
Proceedings of the Ninth Conference on Machine Translation

In the context of the Indic Low Resource Machine Translation (MT) challenge at WMT-24, we participated in four language pairs: English-Assamese (en-as), English-Mizo (en-mz), English-Khasi (en-kh), and English-Manipuri (en-mn). To address these tasks, we employed a transformer-based sequence-to-sequence architecture (Vaswani et al., 2017). In the PRIMARY system, which did not utilize external data, we first pretrained language models (low resource languages) using available monolingual data before finetuning them on small parallel datasets for translation. For the CONTRASTIVE submission approach, we utilized pretrained translation models like IndicTrans2 (Gala et al., 2023) and applied LoRA Fine-tuning (Hu et al., 2021) to adapt them to smaller, low-resource languages, aiming to leverage cross-lingual language transfer capabilities (Conneau and Lample, 2019). These approaches resulted in significant improvements in SacreBLEU scores (Post, 2018) for low-resource languages.

2022

KILDST: Effective Knowledge-Integrated Learning for Dialogue State Tracking using Gazetteer and Speaker Information
Hyungtak Choi | Hyeonmok Ko | Gurpreet Kaur | Lohith Ravuru | Kiranmayi Gandikota | Manisha Jhawar | Simma Dharani | Pranamya Patil
Proceedings of the 19th International Conference on Natural Language Processing (ICON)

Dialogue State Tracking (DST) is core research in dialogue systems and has received much attention. In addition, it is necessary to define a new problem that can deal with dialogue between users as a step toward the conversational AI that extracts and recommends information from the dialogue between users. So, we introduce a new task - DST from dialogue between users about scheduling an event (DST-USERS). The DST-USERS task is much more challenging since it requires the model to understand and track dialogue states in the dialogue between users, as well as to understand who suggested the schedule and who agreed to the proposed schedule. To facilitate DST-USERS research, we develop dialogue datasets between users that plan a schedule. The annotated slot values which need to be extracted in the dialogue are date, time, and location. Previous approaches, such as Machine Reading Comprehension (MRC) and traditional DST techniques, have not achieved good results in our extensive evaluations. By adopting the knowledge-integrated learning method, we achieve exceptional results. The proposed model architecture combines gazetteer features and speaker information efficiently. Our evaluations of the dialogue datasets between users that plan a schedule show that our model outperforms the baseline model.

Efficient Dialog State Tracking Using Gated-Intent based Slot Operation Prediction for On-device Dialog Systems
Pranamya Patil | Hyungtak Choi | Ranjan Samal | Gurpreet Kaur | Manisha Jhawar | Aniruddha Tammewar | Siddhartha Mukherjee
Proceedings of the 19th International Conference on Natural Language Processing (ICON)

Conversational agents on smart devices need to be efficient concerning latency in responding, for enhanced user experience and real-time utility. This demands on-device processing (as on-device processing is quicker), which limits the availability of resources such as memory and processing. Most state-of-the-art Dialog State Tracking (DST) systems make use of large pre-trained language models that require high resource computation, typically available on high-end servers. In contrast, on-device systems are memory efficient, have reduced time/latency, preserve privacy, and don’t rely on the network. A recent approach tries to reduce the latency by splitting the task of slot prediction into two subtasks of State Operation Prediction (SOP) to select an action for each slot, and Slot Value Generation (SVG) responsible for producing values for the identified slots. SVG, being computationally expensive, is performed only for a small subset of actions predicted in the SOP. Motivated by this optimization technique, we build a similar system and work on multi-task learning to achieve significant improvements in DST performance, while optimizing resource consumption. We propose a quadruplet (Domain, Intent, Slot, and Slot Value) based DST, which significantly boosts the performance. We experiment with different techniques to fuse different layers of representations from intent and slot prediction tasks. We obtain the best joint accuracy of 53.3% on the publicly available MultiWOZ 2.2 dataset, using BERT-medium along with a gating mechanism. We also compare the cost efficiency of our system with other large models and find that our system is best suited for an on-device based production environment.

Co-authors

Raghavendra Hr 1

Hyeonmok Ko 1

Siddhartha Mukherjee 1
