"So, How Much Do LLMs Hallucinate on Low-Resource Languages?" A Quantitative and Qualitative Analysis

Kushal Trivedi, Murtuza Shaikh, Sriyansh Sharma


Abstract
Language models have recently gained significant attention in natural language processing, showing strong performance across a wide range of tasks such as text classification, text generation, language modeling, and question answering (QA). Despite these advances, one of the most critical challenges facing language models is hallucination: the generation of fluent, plausible responses that are factually incorrect or fabricated. This study presents preliminary work on analyzing hallucinations in QA tasks for low-resource languages. We evaluate model performance on the Mpox-Myanmar and SynDARin datasets using three API-accessible models (LLaMA 3.1 70B, LLaMA 3.1 8B, and Gemini 2.5) and two monolingual language models (HyGPT 10B for Armenian and SeaLLM for Burmese). Our work systematically examines hallucinations both quantitatively, using Natural Language Inference and semantic-similarity metrics across different model sizes and prompting strategies, and qualitatively, through human verification. We further investigate whether common assumptions about model behavior hold consistently and offer explanations for the observed patterns.
Anthology ID:
2026.loreslm-1.24
Volume:
Proceedings of the Second Workshop on Language Models for Low-Resource Languages (LoResLM 2026)
Month:
March
Year:
2026
Address:
Rabat, Morocco
Editors:
Hansi Hettiarachchi, Tharindu Ranasinghe, Alistair Plum, Paul Rayson, Ruslan Mitkov, Mohamed Gaber, Damith Premasiri, Fiona Anting Tan, Lasitha Uyangodage
Venue:
LoResLM
Publisher:
Association for Computational Linguistics
Pages:
271–287
URL:
https://aclanthology.org/2026.loreslm-1.24/
Cite (ACL):
Kushal Trivedi, Murtuza Shaikh, and Sriyansh Sharma. 2026. "So, How Much Do LLMs Hallucinate on Low-Resource Languages?" A Quantitative and Qualitative Analysis. In Proceedings of the Second Workshop on Language Models for Low-Resource Languages (LoResLM 2026), pages 271–287, Rabat, Morocco. Association for Computational Linguistics.
Cite (Informal):
“So, How Much Do LLMs Hallucinate on Low-Resource Languages?” A Quantitative and Qualitative Analysis (Trivedi et al., LoResLM 2026)
PDF:
https://aclanthology.org/2026.loreslm-1.24.pdf