Siba Sankar Sahu


2025

We evaluate the effect of different language-independent stemmers on information retrieval (IR) tasks in the Indian languages Hindi and Gujarati, as well as in English. We examine the issue from two points of view. Does a language-independent stemmer improve retrieval effectiveness in Indian-language IR? Which language-independent stemmer is most suitable for each language? We observe that stemming improves retrieval effectiveness in these languages compared with the no-stemming baseline. Among the stemmers tested, the co-occurrence-based stemmer (SNS) performs best for Hindi and Gujarati, improving the mean average precision (MAP) score by 2.98% and 20.78%, respectively, whereas the graph-based stemmer (GRAS) performs best for English, improving the MAP score by 5.83%.
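To make the reported metric concrete, the following is a minimal sketch of how MAP and a percentage gain over a no-stemming baseline could be computed. It assumes the reported percentages are relative improvements, which the abstract does not state. The function names, document identifiers, and toy rankings are hypothetical and are not taken from the study.

```python
from typing import Dict, List, Set


def average_precision(ranked: List[str], relevant: Set[str]) -> float:
    """Average precision of one ranked list against its relevant set."""
    if not relevant:
        return 0.0
    hits = 0
    precision_sum = 0.0
    for rank, doc_id in enumerate(ranked, start=1):
        if doc_id in relevant:
            hits += 1
            precision_sum += hits / rank  # precision at this relevant rank
    return precision_sum / len(relevant)


def mean_average_precision(
    runs: Dict[str, List[str]], qrels: Dict[str, Set[str]]
) -> float:
    """Mean of per-query average precision over all judged queries."""
    if not qrels:
        return 0.0
    total = sum(
        average_precision(runs.get(qid, []), rel) for qid, rel in qrels.items()
    )
    return total / len(qrels)


def relative_improvement(baseline: float, stemmed: float) -> float:
    """Percentage gain of the stemmed run over the baseline (assumed relative)."""
    return 100.0 * (stemmed - baseline) / baseline


# Toy data (hypothetical): two queries with relevance judgments.
qrels = {"q1": {"d1", "d3"}, "q2": {"d2"}}
no_stem_run = {"q1": ["d2", "d1", "d3"], "q2": ["d1", "d2"]}
stemmed_run = {"q1": ["d1", "d3", "d2"], "q2": ["d2", "d1"]}

map_base = mean_average_precision(no_stem_run, qrels)
map_stem = mean_average_precision(stemmed_run, qrels)
gain = relative_improvement(map_base, map_stem)
print(f"baseline MAP={map_base:.4f}, stemmed MAP={map_stem:.4f}, gain={gain:.2f}%")
```

In practice, a standard evaluation tool would normally be used on the full query set instead of hand-written code. The sketch only shows what a figure such as "improves the MAP score by 2.98%" would mean under the relative-gain assumption.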