IndicMMLU-Pro: Benchmarking Indic Large Language Models on Multi-Task Language Understanding

Sankalp Jajee; Ashutosh Kumar; Nikunj Kotecha; Vinija Jain; Aman Chadha; Sreyoshi Bhaduri

Abstract

Indic languages, spoken by over 1.5 billion people, pose unique challenges for NLP due to their cultural richness, linguistic diversity, and structural complexity. We present IndicMMLU-Pro, a comprehensive benchmark extending the MMLU-Pro framework to nine major Indic languages: Hindi, Bengali, Gujarati, Marathi, Kannada, Punjabi, Tamil, Telugu, and Urdu. Covering a wide range of tasks in comprehension, reasoning, and generation, IndicMMLU-Pro offers a standardized evaluation framework to advance AI model development in Indic contexts. This paper details the benchmark’s design, taxonomy, and data curation, and establishes baseline results using state-of-the-art multilingual models. As an open resource, IndicMMLU-Pro aims to accelerate progress in Indic language technologies and support inclusive research in multilingual NLP.

Anthology ID: 2026.gem-main.10
Volume: Proceedings of the Fifth Workshop on Generation, Evaluation and Metrics (GEM)
Month: July
Year: 2026
Address: San Diego, California, USA
Editors: Simon Mille, Sebastian Gehrmann, Patrícia Schmidtová, Ondřej Dušek, Marzieh Fadaee, Kyle Lo, Enrico Santus, Gabriel Stanovsky
Venues: GEM | WS
Publisher: Association for Computational Linguistics
Pages: 102–111
URL: https://aclanthology.org/2026.gem-main.10/
Cite (ACL): Sankalp Jajee, Ashutosh Kumar, Nikunj Kotecha, Vinija Jain, Aman Chadha, and Sreyoshi Bhaduri. 2026. IndicMMLU-Pro: Benchmarking Indic Large Language Models on Multi-Task Language Understanding. In Proceedings of the Fifth Workshop on Generation, Evaluation and Metrics (GEM), pages 102–111, San Diego, California, USA. Association for Computational Linguistics.
Cite (Informal): IndicMMLU-Pro: Benchmarking Indic Large Language Models on Multi-Task Language Understanding (Jajee et al., GEM 2026)
PDF: https://aclanthology.org/2026.gem-main.10.pdf
