CL-ReLKT: Cross-lingual Language Knowledge Transfer for Multilingual Retrieval Question Answering

Peerat Limkonchotiwat, Wuttikorn Ponwitayarat, Can Udomcharoenchaikit, Ekapol Chuangsuwanich, Sarana Nutanong


Abstract
Cross-Lingual Retrieval Question Answering (CL-ReQA) is concerned with retrieving answer documents or passages to a question written in a different language. A common approach to CL-ReQA is to create a multilingual sentence embedding space such that question-answer pairs across different languages are close to each other. In this paper, we propose a novel CL-ReQA method utilizing the concept of language knowledge transfer and a new cross-lingual consistency training technique to create a multilingual embedding space for ReQA. To assess the effectiveness of our work, we conducted comprehensive experiments on CL-ReQA and a downstream task, machine reading QA. We compared our proposed method with the current state-of-the-art solutions across three public CL-ReQA corpora. Our method outperforms competitors in 19 out of 21 settings of CL-ReQA. When used with a downstream machine reading QA task, our method outperforms the best existing language-model-based method by 10% in F1 while being 10 times faster in sentence embedding computation. The code and models are available at https://github.com/mrpeerat/CL-ReLKT.
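The retrieval step that the abstract refers to, ranking candidate passages by their closeness to a question in a shared embedding space, can be illustrated with a minimal sketch. This is not the authors' implementation: the three-dimensional vectors are made-up stand-ins for the output of a multilingual sentence encoder, and the names `cosine`, `retrieve`, and the passage ids are hypothetical.

```python
import math


def cosine(u, v):
    """Cosine similarity between two equal-length, non-zero vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


def retrieve(question_vec, passage_vecs, top_k=1):
    """Return the ids of the top_k passages most similar to the question."""
    scored = [(cosine(question_vec, vec), pid) for pid, vec in passage_vecs.items()]
    scored.sort(reverse=True)  # highest similarity first
    return [pid for _, pid in scored[:top_k]]


# Toy embeddings (assumed values, not produced by any real model).
# In a cross-lingual setting the question could be written in one language
# and the passages in another, as long as both are embedded in the same space.
passages = {
    "passage_capital": [0.9, 0.1, 0.0],
    "passage_weather": [0.1, 0.8, 0.3],
    "passage_sports": [0.0, 0.2, 0.9],
}
question = [0.85, 0.15, 0.05]

best = retrieve(question, passages)
top_two = retrieve(question, passages, top_k=2)
```

With these toy vectors the question lies closest to `passage_capital`, so that id is ranked first. A real system would replace the hand-written vectors with encoder outputs and would typically use an approximate nearest-neighbour index rather than a full sort.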
Anthology ID:
2022.findings-naacl.165
Volume:
Findings of the Association for Computational Linguistics: NAACL 2022
Month:
July
Year:
2022
Address:
Seattle, United States
Editors:
Marine Carpuat, Marie-Catherine de Marneffe, Ivan Vladimir Meza Ruiz
Venue:
Findings
Publisher:
Association for Computational Linguistics
Pages:
2141–2155
URL:
https://aclanthology.org/2022.findings-naacl.165
DOI:
10.18653/v1/2022.findings-naacl.165
Cite (ACL):
Peerat Limkonchotiwat, Wuttikorn Ponwitayarat, Can Udomcharoenchaikit, Ekapol Chuangsuwanich, and Sarana Nutanong. 2022. CL-ReLKT: Cross-lingual Language Knowledge Transfer for Multilingual Retrieval Question Answering. In Findings of the Association for Computational Linguistics: NAACL 2022, pages 2141–2155, Seattle, United States. Association for Computational Linguistics.
Cite (Informal):
CL-ReLKT: Cross-lingual Language Knowledge Transfer for Multilingual Retrieval Question Answering (Limkonchotiwat et al., Findings 2022)
PDF:
https://aclanthology.org/2022.findings-naacl.165.pdf
Video:
https://aclanthology.org/2022.findings-naacl.165.mp4
Code:
mrpeerat/cl-relkt
Data:
MLQA, SQuAD, XQuAD