Vishwa Shah

2025

pdf bib abs
NormAd: A Framework for Measuring the Cultural Adaptability of Large Language Models
Abhinav Sukumar Rao | Akhila Yerukola | Vishwa Shah | Katharina Reinecke | Maarten Sap
Proceedings of the 2025 Conference of the Nations of the Americas Chapter of the Association for Computational Linguistics: Human Language Technologies (Volume 1: Long Papers)

To be effectively and safely deployed to global user populations, large language models (LLMs) may need to adapt outputs to user values and cultures, not just know about them. We introduce NormAd, an evaluation framework to assess LLMs’ cultural adaptability, specifically measuring their ability to judge social acceptability across varying levels of cultural norm specificity, from abstract values to explicit social norms. As an instantiation of our framework, we create NormAd-Eti, a benchmark of 2.6k situational descriptions representing social-etiquette related cultural norms from 75 countries. Through comprehensive experiments on NormAd-Eti, we find that LLMs struggle to accurately judge social acceptability across these varying degrees of cultural contexts and show stronger adaptability to English-centric cultures over those from the Global South. Even in the simplest setting where the relevant social norms are provided, the best LLMs’ performance (\textless 82%) lags behind humans (\textgreater 95%). In settings with abstract values and country information, model performance drops substantially (\textless 60%), while human accuracy remains high (\textgreater90%). Furthermore, we find that models are better at recognizing socially acceptable versus unacceptable situations. Our findings showcase the current pitfalls in socio-cultural reasoning of LLMs which hinder their adaptability for global audiences.

2024

pdf bib abs
AdaPT: A Set of Guidelines for Hyperbolic Multimodal Multilingual NLP
Ramit Sawhney | Shrey Pandit | Vishwa Shah | Megh Thakkar | Shafiq Joty
Findings of the Association for Computational Linguistics: NAACL 2024

The Euclidean space is the familiar space for training neural models and performing arithmetic operations.However, many data types inherently possess complex geometries, and model training methods involve operating over their latent representations, which cannot be effectively captured in the Euclidean space.The hyperbolic space provides a more generalized representative geometry to model the hierarchical complexities of the tree-like structure of natural language.We propose AdaPT a set of guidelines for initialization, parametrization, and training of neural networks, which adapts to the dataset and can be used with different manifolds. AdaPT can be generalized over any existing neural network training methodology and leads to more stable training without a substantial increase in training time.We apply AdaPT guidelines over two state-of-the-art deep learning approaches and empirically demonstrate its effectiveness through experiments on three tasks over 12 languages across speech and text.Through extensive qualitative analysis, we put forward the applicability of AdaPT as a set of guidelines optimally utilizing the manifold geometry, which can be extended to various downstream tasks across languages and modalities.

pdf bib abs
Calc-CMU at SemEval-2024 Task 7: Pre-Calc - Learning to Use the Calculator Improves Numeracy in Language Models
Vishruth Veerendranath | Vishwa Shah | Kshitish Ghate
Proceedings of the 18th International Workshop on Semantic Evaluation (SemEval-2024)

Quantitative and numerical comprehension in language is an important task in many fields like education and finance, but still remains a challenging task for language models. While tool and calculator usage has shown to be helpful to improve mathematical reasoning in large pretrained decoder-only language models, this remains unexplored for smaller language models with encoders. In this paper, we propose Pre-Calc, a simple pre-finetuning objective of learning to use the calculator for both encoder-only and encoder-decoder architectures, formulated as a discriminative and generative task respectively. We pre-train BERT and RoBERTa for discriminative calculator use and Flan-T5 for generative calculator use on the MAWPS, SVAMP, and AsDiv-A datasets, which improves performance on downstream tasks that require numerical understanding. Our code and data are available at https://github.com/calc-cmu/pre-calc.

Co-authors

Shrey Pandit 1

Abhinav Sukumar Rao 1

Katharina Reinecke 1

Maarten Sap 1

Vishruth Veerendranath 1

Akhila Yerukola 1

Venues

Fix author