A Multilingual Reading Comprehension System for more than 100 Languages

Anthony Ferritto, Sara Rosenthal, Mihaela Bornea, Kazi Hasan, Rishav Chakravarti, Salim Roukos, Radu Florian, Avi Sil


Abstract
This paper presents M-GAAMA, a Multilingual Question Answering architecture and demo system. It is the first multilingual machine reading comprehension (MRC) demo that can answer questions in over 100 languages. M-GAAMA answers questions from a given passage written in the same language as the question or in a different one. It incorporates several existing multilingual models, such as M-BERT and XLM-R, that can be used interchangeably in the demo. The M-GAAMA demo also improves language accessibility by incorporating the IBM Watson machine translation widget, which lets users see an answer in their desired language. We also show how M-GAAMA can be used in downstream tasks by incorporating it into an END-TO-END-QA system using CFO (Chakravarti et al., 2019). We experiment with our system architecture on the Multi-Lingual Question Answering (MLQA) and the COVID-19 CORD (Wang et al., 2020; Tang et al., 2020) datasets to provide insights into the performance of the system.
Anthology ID:
2020.coling-demos.8
Volume:
Proceedings of the 28th International Conference on Computational Linguistics: System Demonstrations
Month:
December
Year:
2020
Address:
Barcelona, Spain (Online)
Editors:
Michal Ptaszynski, Bartosz Ziolko
Venue:
COLING
Publisher:
International Committee on Computational Linguistics (ICCL)
Pages:
41–47
URL:
https://aclanthology.org/2020.coling-demos.8
DOI:
10.18653/v1/2020.coling-demos.8
Cite (ACL):
Anthony Ferritto, Sara Rosenthal, Mihaela Bornea, Kazi Hasan, Rishav Chakravarti, Salim Roukos, Radu Florian, and Avi Sil. 2020. A Multilingual Reading Comprehension System for more than 100 Languages. In Proceedings of the 28th International Conference on Computational Linguistics: System Demonstrations, pages 41–47, Barcelona, Spain (Online). International Committee on Computational Linguistics (ICCL).
Cite (Informal):
A Multilingual Reading Comprehension System for more than 100 Languages (Ferritto et al., COLING 2020)
PDF:
https://aclanthology.org/2020.coling-demos.8.pdf
Data:
CORD-19, Natural Questions, SQuAD, TyDiQA