Language Modelling with NMT Query Translation for Amharic-Arabic Cross-Language Information Retrieval

Ibrahim Gashaw; H. L. Shashirekha

Abstract

This paper describes our first experiment on Neural Machine Translation (NMT) based query translation for the Amharic-Arabic Cross-Language Information Retrieval (CLIR) task, which retrieves relevant documents from Amharic and Arabic text collections in response to a query expressed in Amharic. We used a pre-trained NMT model to map a query in the source language into an equivalent query in the target language. The relevant documents are then retrieved using a Language Modeling (LM) based retrieval algorithm. Experiments are conducted on four conventional IR models, namely Uni-gram LM, Bi-gram LM, a Probabilistic model, and the Vector Space Model (VSM). The results show that the proposed Uni-gram LM outperforms all other models on both the Amharic and Arabic document collections.
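
As a rough illustration of the retrieval step the abstract describes, the following is a minimal sketch of unigram query-likelihood ranking. It is not the authors' implementation: the abstract does not state a smoothing method, so Jelinek-Mercer interpolation with a hypothetical weight `lam` is assumed here, and the function name `unigram_lm_scores` and the toy tokens are placeholders. In the CLIR setting, `query_tokens` would be the tokens of the query after NMT translation.

```python
import math
from collections import Counter


def unigram_lm_scores(query_tokens, docs, lam=0.5):
    """Score each document by log P(query | document) under a unigram LM.

    Assumption (not from the paper): document and collection probabilities
    are mixed with Jelinek-Mercer smoothing, weight `lam` on the document.
    """
    doc_counts = [Counter(doc) for doc in docs]

    # Collection-wide term counts, used as the smoothing background model.
    collection = Counter()
    for counts in doc_counts:
        collection.update(counts)
    total = sum(collection.values())

    scores = []
    for doc, counts in zip(docs, doc_counts):
        score = 0.0
        for term in query_tokens:
            p_doc = counts[term] / len(doc) if doc else 0.0
            p_col = collection[term] / total if total else 0.0
            p = lam * p_doc + (1 - lam) * p_col
            if p == 0.0:
                # Term unseen in the whole collection: no smoothing mass left.
                score = float("-inf")
                break
            score += math.log(p)
        scores.append(score)
    return scores


# Toy usage with placeholder tokens standing in for a translated query.
docs = [["a", "b", "a"], ["b", "c"]]
scores = unigram_lm_scores(["a"], docs)
ranking = sorted(range(len(docs)), key=lambda i: scores[i], reverse=True)
```

Documents are ranked by descending score, so a document that contains the query term more densely is placed ahead of one that relies only on the collection background.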

Anthology ID: 2019.icon-1.7
Volume: Proceedings of the 16th International Conference on Natural Language Processing
Month: December
Year: 2019
Address: International Institute of Information Technology, Hyderabad, India
Editors: Dipti Misra Sharma, Pushpak Bhattacharya
Venue: ICON
Publisher: NLP Association of India
Pages: 56–64
URL: https://aclanthology.org/2019.icon-1.7/
Cite (ACL): Ibrahim Gashaw and H. L. Shashirekha. 2019. Language Modelling with NMT Query Translation for Amharic-Arabic Cross-Language Information Retrieval. In Proceedings of the 16th International Conference on Natural Language Processing, pages 56–64, International Institute of Information Technology, Hyderabad, India. NLP Association of India.
Cite (Informal): Language Modelling with NMT Query Translation for Amharic-Arabic Cross-Language Information Retrieval (Gashaw & Shashirekha, ICON 2019)
PDF: https://aclanthology.org/2019.icon-1.7.pdf
