Stefan Kemnitz
2025
Prompt Engineering for Nepali NER: Leveraging Hindi-Capable LLMs for Low-Resource Languages
Dipendra Yadav
|
Sumaiya Suravee
|
Stefan Kemnitz
|
Tobias Strauss
|
Kristina Yordanova
Proceedings of the 15th International Conference on Recent Advances in Natural Language Processing - Natural Language Processing in the Generative AI Era
This study provides a systematic evaluation of prompt engineering strategies for Named Entity Recognition in Nepali, a low-resource language with high similarity to Hindi, by leveraging Hindi-capable Meta’s LLaMA 3.3:70B model. Four prompting techniques—Baseline, Chain-of-Thought, Self-Refine, and Least-toMost—are assessed in both zero-shot and fewshot settings. As a novel contribution, we propose an entity-aware sentence selection strategy that prioritizes example diversity and entity coverage for few-shot prompting. Experimental results show that, without Nepali examples, zero-shot and one-shot prompts frequently yield unstructured or hallucinated outputs, underscoring the limitations of cross-lingual capabilities without in-context supervision. However, including even a small number of carefully selected Nepali examples—sometimes as few as ten—substantially enhances model performance, with the Least-to-Most approach achieving the highest F1 scores. These findings highlight the potential of prompt-based adaptation and principled example curation for extending LLM capabilities to related, low-resource languages, offering a practical alternative to full model fine-tuning.