Mehar Bhatia


2024

pdf bib
From Local Concepts to Universals: Evaluating the Multicultural Understanding of Vision-Language Models
Mehar Bhatia | Sahithya Ravi | Aditya Chinchure | EunJeong Hwang | Vered Shwartz
Proceedings of the 2024 Conference on Empirical Methods in Natural Language Processing

Despite recent advancements in vision-language models, their performance remains suboptimal on images from non-western cultures due to underrepresentation in training datasets. Various benchmarks have been proposed to test models’ cultural inclusivity. Still, they have limited coverage of cultures and do not adequately assess cultural diversity across universal and culture-specific local concepts. To address these limitations, we introduce the GlobalRG benchmark, comprising two challenging tasks: retrieval across universals and cultural visual grounding. The former task entails retrieving culturally diverse images for universal concepts from 50 countries, while the latter aims at grounding culture-specific concepts within images from 15 countries. Our evaluation across a wide range of models reveals that the performance varies significantly across cultures – underscoring the necessity for enhancing multicultural understanding in vision-language models.

2023

pdf bib
GD-COMET: A Geo-Diverse Commonsense Inference Model
Mehar Bhatia | Vered Shwartz
Proceedings of the 2023 Conference on Empirical Methods in Natural Language Processing

With the increasing integration of AI into everyday life, it’s becoming crucial to design AI systems to serve users from diverse backgrounds by making them culturally aware. In this paper, we present GD-COMET, a geo-diverse version of the COMET commonsense inference model. GD-COMET goes beyond Western commonsense knowledge and is capable of generating inferences pertaining to a broad range of cultures. We demonstrate the effectiveness of GD-COMET through a comprehensive human evaluation across 5 diverse cultures, as well as extrinsic evaluation on a geo-diverse task. The evaluation shows that GD-COMET captures and generates culturally nuanced commonsense knowledge, demonstrating its potential to benefit NLP applications across the board and contribute to making NLP more inclusive.

2019

pdf bib
A Survey on Ontology Enrichment from Text
Vivek Iyer | Lalit Mohan | Mehar Bhatia | Y. Raghu Reddy
Proceedings of the 16th International Conference on Natural Language Processing

Increased internet bandwidth at low cost is leading to the creation of large volumes of unstructured data. This data explosion opens up opportunities for the creation of a variety of data-driven intelligent systems, such as the Semantic Web. Ontologies form one of the most crucial layers of semantic web, and the extraction and enrichment of ontologies given this data explosion becomes an inevitable research problem. In this paper, we survey the literature on semi-automatic and automatic ontology extraction and enrichment and classify them into four broad categories based on the approach. Then, we proceed to narrow down four algorithms from each of these categories, implement and analytically compare them based on parameters like context relevance, efficiency and precision. Lastly, we propose a Long Short Term Memory Networks (LSTM) based deep learning approach to try and overcome the gaps identified in these approaches.