Effective Stemming for Arabic Information Retrieval

Youssef Kadri, Jian-Yun Nie


Abstract
Arabic has a rich and complex morphology, and processing it appropriately is very important for Information Retrieval (IR). In this paper, we propose a new stemming technique that tries to determine the stem of a word, that is, the part that represents its semantic core according to Arabic morphology. This method is compared to a commonly used light stemming technique, which truncates a word using simple rules. Our tests on TREC collections show that the new stemming technique is more effective than light stemming.
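
To illustrate what light stemming by simple truncation rules can look like, here is a minimal Python sketch. It is not the light stemmer evaluated in the paper, nor the proposed new technique: the affix lists, the `light_stem` name, and the `min_len` guard are all assumptions chosen for illustration.

```python
# Illustrative affix lists (assumptions, not taken from the paper).
# Longer prefixes come first so that the longest match wins.
PREFIXES = ["وال", "بال", "كال", "فال", "ال", "لل", "و"]
SUFFIXES = ["ها", "ان", "ات", "ون", "ين", "يه", "ية", "ه", "ة", "ي"]


def light_stem(word: str, min_len: int = 3) -> str:
    """Strip at most one prefix, then any trailing suffixes.

    An affix is removed only if at least `min_len` letters remain,
    which keeps very short words from being truncated to nothing.
    """
    # Remove the first matching prefix, if any.
    for prefix in PREFIXES:
        if word.startswith(prefix) and len(word) - len(prefix) >= min_len:
            word = word[len(prefix):]
            break

    # Repeatedly remove matching suffixes until none applies.
    stripped = True
    while stripped:
        stripped = False
        for suffix in SUFFIXES:
            if word.endswith(suffix) and len(word) - len(suffix) >= min_len:
                word = word[:-len(suffix)]
                stripped = True
                break
    return word
```

With these hypothetical lists, `light_stem("والكتاب")` returns `"كتاب"` and `light_stem("المعلمون")` returns `"معلم"`. The sketch truncates purely by string matching and does no morphological analysis, which is the kind of limitation a morphology-aware stemmer aims to address.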
Anthology ID: 2006.bcs-1.6
Volume: Proceedings of the International Conference on the Challenge of Arabic for NLP/MT
Month: October 23
Year: 2006
Address: London, UK
Venue: BCS
Pages: 68–75
URL: https://aclanthology.org/2006.bcs-1.6
Cite (ACL): Youssef Kadri and Jian-Yun Nie. 2006. Effective Stemming for Arabic Information Retrieval. In Proceedings of the International Conference on the Challenge of Arabic for NLP/MT, pages 68–75, London, UK.
Cite (Informal): Effective Stemming for Arabic Information Retrieval (Kadri & Nie, BCS 2006)
PDF: https://aclanthology.org/2006.bcs-1.6.pdf