M. Sharma Dipti


2023

Towards Large Language Model driven Reference-less Translation Evaluation for English and Indian Language
Mujadia Vandan | Mishra Pruthwik | Ahsan Arafat | M. Sharma Dipti
Proceedings of the 20th International Conference on Natural Language Processing (ICON)

With the primary focus on evaluating the effectiveness of large language models for automatic reference-less translation assessment, this work presents our experiments on mimicking human direct assessment to evaluate the quality of translations in English and Indian languages. We constructed a translation evaluation task where we performed zero-shot learning, in-context example-driven learning, and fine-tuning of large language models to provide a score out of 100, where 100 represents a perfect translation and 1 represents a poor translation. We compared the performance of our trained systems with existing methods such as COMET, BERT-Scorer, and LABSE, and found that the LLM-based evaluator (LLaMA2-13B) achieves a comparable or higher overall correlation with human judgments for the considered Indian language pairs (see Figure 1).
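As a rough illustration of what "correlation with human judgments" means here, the sketch below computes a Pearson coefficient between human direct-assessment scores and evaluator scores. The abstract does not say which correlation coefficient was used, so Pearson is an assumption, and the function name and all scores are hypothetical values chosen only for illustration.

```python
from math import sqrt


def pearson(xs, ys):
    """Pearson correlation between two equal-length lists of scores."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two equal-length lists with at least 2 items")
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    # Covariance and variances (unnormalised; the 1/n factors cancel).
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for constant scores")
    return cov / sqrt(var_x * var_y)


# Hypothetical scores on the 1-100 scale, not taken from the paper.
human_scores = [90, 40, 75, 20, 60]
model_scores = [85, 50, 70, 30, 65]
r = pearson(human_scores, model_scores)
print(f"Pearson r = {r:.3f}")
```

A value close to 1 would mean the evaluator ranks and spaces translations much as the human annotators do; a rank-based coefficient such as Kendall's tau could be substituted under the same setup.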

Verb Categorisation for Hindi Word Problem Solving
Sharma Harshita | Mishra Pruthwik | M. Sharma Dipti
Proceedings of the 20th International Conference on Natural Language Processing (ICON)

Word problem solving is a challenging NLP task that deals with solving mathematical problems described in natural language. Recently, there has been renewed interest in developing word problem solvers for Indian languages. As part of this paper, we have built a Hindi arithmetic word problem solver which makes use of verbs. Additionally, we have created verb categorization data for Hindi. Verbs are very important for solving word problems with addition/subtraction operations as they help us identify the set of operations required to solve the word problems. We propose a rule-based solver that uses verb categorisation to identify operations in a word problem and generate answers for it. To perform verb categorisation, we explore several approaches and present a comparative study.
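The idea of mapping verb categories to arithmetic operations can be sketched as follows. This is only a minimal illustration, not the paper's solver: the category names ("observation", "positive", "negative"), the `solve` function, and the example problem are all hypothetical, since the abstract does not list the actual label set or rules. The sketch also assumes that each sentence has already been reduced to a verb category and a quantity.

```python
def solve(steps):
    """Apply a sequence of (verb_category, quantity) steps to a running total.

    Hypothetical categories, assumed for illustration:
      "observation" sets the current quantity (e.g. "has 5 apples"),
      "positive"    adds to it (e.g. "buys 3 more"),
      "negative"    subtracts from it (e.g. "gives away 2").
    """
    total = 0
    for category, quantity in steps:
        if category == "observation":
            total = quantity
        elif category == "positive":
            total += quantity
        elif category == "negative":
            total -= quantity
        else:
            raise ValueError(f"unknown verb category: {category!r}")
    return total


# Hypothetical problem: someone has 5 apples, buys 3, then gives away 2.
answer = solve([("observation", 5), ("positive", 3), ("negative", 2)])
print(answer)  # 6
```

In such a setup the verb categoriser does the linguistic work, and the arithmetic reduces to a simple update per sentence.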

Automatic Data Retrieval for Cross Lingual Summarization
Bhatnagar Nikhilesh | Urlana Ashok | Mishra Pruthwik | Mujadia Vandan | M. Sharma Dipti
Proceedings of the 20th International Conference on Natural Language Processing (ICON)

Cross-lingual summarization involves the summarization of text written in one language to a different one. There is a body of research addressing cross-lingual summarization from English to other European languages. In this work, we aim to perform cross-lingual summarization from English to Hindi. We propose that pairing up the coverage of newsworthy events in textual and video formats can be helpful for data acquisition for cross-lingual summarization. We analyze the data and propose methods to match articles to video descriptions that serve as document and summary pairs. We also outline filtering methods over reasonable thresholds to ensure the correctness of the summaries. Further, we make available 28,583 mono and cross-lingual article-summary pairs. We also build and analyze multiple baselines on the collected data and report error analysis.