SEARCHER: Shared Embedding Architecture for Effective Retrieval

Joel Barry, Elizabeth Boschee, Marjorie Freedman, Scott Miller


Abstract
We describe an approach to cross-lingual information retrieval that does not rely on explicit translation of either document or query terms. Instead, both queries and documents are mapped into a shared embedding space where retrieval is performed. We discuss potential advantages of the approach in handling polysemy and synonymy. We present a method for training the model and give details of the model implementation. Finally, we report experimental results for two cases: Somali-English and Bulgarian-English CLIR.
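The core retrieval step described in the abstract can be illustrated with a minimal sketch. In the paper's setting, trained encoders map queries (e.g. Somali or Bulgarian) and English documents into one shared embedding space; the toy unit vectors and the `retrieve` helper below are hypothetical stand-ins, shown only to make the nearest-neighbor retrieval step concrete.

```python
import numpy as np

# Toy "shared space" document embeddings (illustrative only; the paper's
# encoders would produce these from text in either language).
doc_vecs = np.array([
    [1.0, 0.0],   # doc 0
    [0.0, 1.0],   # doc 1
    [0.8, 0.6],   # doc 2
])
doc_vecs = doc_vecs / np.linalg.norm(doc_vecs, axis=1, keepdims=True)

def retrieve(query_vec, doc_vecs, k=2):
    """Return indices of the top-k documents by cosine similarity."""
    q = query_vec / np.linalg.norm(query_vec)
    scores = doc_vecs @ q           # cosine similarity of unit vectors
    return np.argsort(-scores)[:k]  # highest-scoring documents first

# A query embedded near doc 0's region retrieves docs 0 and 2 first.
query = np.array([0.9, 0.1])
top = retrieve(query, doc_vecs)
```

Because both languages share one space, no translation step is needed at query time; retrieval reduces to a similarity search over precomputed document vectors.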
Anthology ID: 2020.clssts-1.4
Volume: Proceedings of the workshop on Cross-Language Search and Summarization of Text and Speech (CLSSTS2020)
Month: May
Year: 2020
Address: Marseille, France
Venues: CLSSTS | LREC | WS
Publisher: European Language Resources Association
Pages: 22–25
Language: English
URL: https://aclanthology.org/2020.clssts-1.4
PDF: https://aclanthology.org/2020.clssts-1.4.pdf