Enhancing Answer Boundary Detection for Multilingual Machine Reading Comprehension

Fei Yuan, Linjun Shou, Xuanyu Bai, Ming Gong, Yaobo Liang, Nan Duan, Yan Fu, Daxin Jiang


Abstract
Multilingual pre-trained models can leverage training data from a resource-rich source language (such as English) to improve performance on low-resource languages. However, transfer quality for multilingual Machine Reading Comprehension (MRC) is significantly worse than for sentence classification tasks, mainly because MRC requires detecting the word-level answer boundary. In this paper, we propose two auxiliary tasks in the fine-tuning stage to create additional phrase boundary supervision: (1) a mixed MRC task, which translates the question or passage into other languages and builds cross-lingual question-passage pairs; (2) a language-agnostic knowledge masking task that leverages knowledge phrases mined from the web. Extensive experiments on two cross-lingual MRC datasets show the effectiveness of our proposed approach.
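As a rough illustration of the mixed MRC task described in the abstract, the sketch below pairs each question with each passage written in a different language. It is a minimal sketch under stated assumptions, not the authors' implementation: the function name `build_mixed_pairs`, the dictionary layout, and the example sentences are all hypothetical, the translations are assumed to exist already, and answer-span alignment across languages is left out entirely.

```python
from itertools import product


def build_mixed_pairs(questions, passages):
    """Pair every question with every passage written in a different language.

    Both arguments map a language code to text in that language.
    This layout is an assumption made for illustration only.
    """
    pairs = []
    for (q_lang, q_text), (p_lang, p_text) in product(
        questions.items(), passages.items()
    ):
        if q_lang != p_lang:  # keep only cross-lingual combinations
            pairs.append(
                {
                    "q_lang": q_lang,
                    "p_lang": p_lang,
                    "question": q_text,
                    "passage": p_text,
                }
            )
    return pairs


# Hypothetical example data: one question and one passage in two languages.
questions = {
    "en": "Who wrote Hamlet?",
    "de": "Wer schrieb Hamlet?",
}
passages = {
    "en": "Hamlet was written by William Shakespeare.",
    "de": "Hamlet wurde von William Shakespeare geschrieben.",
}

mixed = build_mixed_pairs(questions, passages)
for pair in mixed:
    print(pair["q_lang"], "->", pair["p_lang"])
```

With two languages this yields two cross-lingual pairs (English question with German passage, and the reverse); same-language pairs are skipped because they are already covered by ordinary MRC training data.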
Anthology ID:
2020.acl-main.87
Volume:
Proceedings of the 58th Annual Meeting of the Association for Computational Linguistics
Month:
July
Year:
2020
Address:
Online
Editors:
Dan Jurafsky, Joyce Chai, Natalie Schluter, Joel Tetreault
Venue:
ACL
Publisher:
Association for Computational Linguistics
Pages:
925–934
URL:
https://aclanthology.org/2020.acl-main.87
DOI:
10.18653/v1/2020.acl-main.87
Cite (ACL):
Fei Yuan, Linjun Shou, Xuanyu Bai, Ming Gong, Yaobo Liang, Nan Duan, Yan Fu, and Daxin Jiang. 2020. Enhancing Answer Boundary Detection for Multilingual Machine Reading Comprehension. In Proceedings of the 58th Annual Meeting of the Association for Computational Linguistics, pages 925–934, Online. Association for Computational Linguistics.
Cite (Informal):
Enhancing Answer Boundary Detection for Multilingual Machine Reading Comprehension (Yuan et al., ACL 2020)
PDF:
https://aclanthology.org/2020.acl-main.87.pdf
Video:
http://slideslive.com/38929253
Data
MLQA, SQuAD