Vivek Kumar


2024

pdf bib
Ask the experts: sourcing a high-quality nutrition counseling dataset through Human-AI collaboration
Simone Balloccu | Ehud Reiter | Karen Jia-Hui Li | Rafael Sargsyan | Vivek Kumar | Diego Reforgiato | Daniele Riboni | Ondrej Dusek
Findings of the Association for Computational Linguistics: EMNLP 2024

Large Language Models (LLMs) are being employed by end-users for various tasks, including sensitive ones such as health counseling, disregarding potential safety concerns. It is thus necessary to understand how adequately LLMs perform in such domains. We conduct a case study on ChatGPT in nutrition counseling, a popular use-case where the model supports a user with their dietary struggles. We crowd-source real-world diet-related struggles, then work with nutrition experts to generate supportive text using ChatGPT. Finally, experts evaluate the safety and text quality of ChatGPT’s output. The result is the HAI-coaching dataset, containing ~2.4K crowdsourced dietary struggles and ~97K corresponding ChatGPT-generated and expert-annotated supportive texts. We analyse ChatGPT’s performance, discovering potentially harmful behaviours, especially for sensitive topics like mental health. Finally, we use HAI-coaching to test open LLMs on various downstream tasks, showing that even the latest models struggle to achieve good performance. HAI-coaching is available at https://github.com/uccollab/hai-coaching/

2023

pdf bib
VISU at WASSA 2023 Shared Task: Detecting Emotions in Reaction to News Stories Using Transformers and Stacked Embeddings
Vivek Kumar | Prayag Tiwari | Sushmita Singh
Proceedings of the 13th Workshop on Computational Approaches to Subjectivity, Sentiment, & Social Media Analysis

Our system, VISU, participated in the WASSA 2023 Shared Task (3) of Emotion Classification from essays written in reaction to news articles. Emotion detection from complex dialogues is challenging and often requires context/domain understanding. Therefore in this research, we have focused on developing deep learning (DL) models using the combination of word embedding representations with tailored prepossessing strategies to capture the nuances of emotions expressed. Our experiments used static and contextual embeddings (individual and stacked) with Bidirectional Long short-term memory (BiLSTM) and Transformer based models. We occupied rank tenth in the emotion detection task by scoring a Macro F1-Score of 0.2717, validating the efficacy of our implemented approaches for small and imbalanced datasets with mixed categories of target emotions.

2022

pdf bib
Practice Makes a Solver Perfect: Data Augmentation for Math Word Problem Solvers
Vivek Kumar | Rishabh Maheshwary | Vikram Pudi
Proceedings of the 2022 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies

Existing Math Word Problem (MWP) solvers have achieved high accuracy on benchmark datasets. However, prior works have shown that such solvers do not generalize well and rely on superficial cues to achieve high performance. In this paper, we first conduct experiments to showcase that this behaviour is mainly associated with the limited size and diversity present in existing MWP datasets. Next, we propose several data augmentation techniques broadly categorized into Substitution and Paraphrasing based methods. By deploying these methods we increase the size of existing datasets by five folds. Extensive experiments on two benchmark datasets across three state-of-the-art MWP solvers shows that proposed methods increase the generalization and robustness of existing solvers. On average, proposed methods significantly increase the state-of-the-art results by over five percentage points on benchmark datasets. Further, the solvers trained on the augmented dataset performs comparatively better on the challenge test set. We also show the effectiveness of proposed techniques through ablation studies and verify the quality of augmented samples through human evaluation.

2021

pdf bib
Adversarial Examples for Evaluating Math Word Problem Solvers
Vivek Kumar | Rishabh Maheshwary | Vikram Pudi
Findings of the Association for Computational Linguistics: EMNLP 2021

Standard accuracy metrics have shown that Math Word Problem (MWP) solvers have achieved high performance on benchmark datasets. However, the extent to which existing MWP solvers truly understand language and its relation with numbers is still unclear. In this paper, we generate adversarial attacks to evaluate the robustness of state-of-the-art MWP solvers. We propose two methods, Question Reordering and Sentence Paraphrasing to generate adversarial attacks. We conduct experiments across three neural MWP solvers over two benchmark datasets. On average, our attack method is able to reduce the accuracy of MWP solvers by over 40% on these datasets. Our results demonstrate that existing MWP solvers are sensitive to linguistic variations in the problem text. We verify the validity and quality of generated adversarial examples through human evaluation.