Robust Machine Reading Comprehension by Learning Soft labels
Zhenyu Zhao | Shuangzhi Wu | Muyun Yang | Kehai Chen | Tiejun Zhao
Proceedings of the 28th International Conference on Computational Linguistics
Neural models for machine reading comprehension (MRC) have achieved great success, but they are typically trained on hard labels. We argue that hard labels limit a model's ability to generalize because of the label sparseness problem. In this paper, we propose a robust training method for MRC models that addresses this problem. Our method consists of three strategies: (1) label smoothing, (2) word overlapping, and (3) distribution prediction, all of which help train models on soft labels. We validate our approach on a representative architecture, ALBERT. Experimental results show that our method improves the baseline by 1% on average and achieves state-of-the-art performance on NewsQA and QUOREF.
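
To make the contrast between hard and soft labels concrete, the following minimal sketch shows the first strategy, label smoothing, applied to the start position of an answer span. It uses the standard uniform-smoothing formulation, which is an assumption here and may differ from the exact variant used in the paper. The function names `smooth_labels` and `soft_cross_entropy` and the value `epsilon=0.1` are hypothetical choices for illustration.

```python
import math


def smooth_labels(gold_index, num_positions, epsilon=0.1):
    """Turn a hard (one-hot) position label into a soft distribution.

    Assumed formulation: standard uniform label smoothing, where
    epsilon of the probability mass is spread evenly over all positions.
    """
    if not 0 <= gold_index < num_positions:
        raise ValueError("gold_index out of range")
    uniform = epsilon / num_positions
    soft = [uniform] * num_positions
    soft[gold_index] += 1.0 - epsilon
    return soft


def soft_cross_entropy(logits, targets):
    """Cross-entropy between soft targets and softmax(logits)."""
    # Log-sum-exp with max subtraction for numerical stability.
    max_logit = max(logits)
    log_norm = max_logit + math.log(sum(math.exp(x - max_logit) for x in logits))
    return -sum(t * (x - log_norm) for t, x in zip(targets, logits))


# Hypothetical passage of 5 tokens whose answer starts at token 2.
targets = smooth_labels(gold_index=2, num_positions=5, epsilon=0.1)
# Logits that are uniform over positions, as an untrained model might produce.
loss = soft_cross_entropy([0.0, 0.0, 0.0, 0.0, 0.0], targets)
```

With a hard label, only the gold position contributes to the loss. With the smoothed target, every position carries a small amount of probability mass, so the model is not pushed toward assigning exactly zero probability to the non-gold positions.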