Saliency-based Multi-View Mixed Language Training for Zero-shot Cross-lingual Classification
Siyu Lai | Hui Huang | Dong Jing | Yufeng Chen | Jinan Xu | Jian Liu
Findings of the Association for Computational Linguistics: EMNLP 2021
Recent multilingual pre-trained models, such as XLM-RoBERTa (XLM-R), have proven effective on many cross-lingual tasks. However, gaps remain between the contextualized representations of similar words in different languages. To address this problem, we propose a novel framework named Multi-View Mixed Language Training (MVMLT), which leverages code-switched data with multi-view learning to fine-tune XLM-R. MVMLT uses gradient-based saliency to extract the keywords most relevant to the downstream task and dynamically replaces them with their counterparts in the target language. Furthermore, MVMLT utilizes multi-view learning to encourage contextualized embeddings to align into a more refined language-invariant space. Extensive experiments on four languages show that our model achieves state-of-the-art results on zero-shot cross-lingual sentiment classification and dialogue state tracking, demonstrating the effectiveness of the proposed approach.
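Below is a minimal sketch of the two steps the abstract describes: gradient-based saliency scoring of input tokens, followed by dictionary-based code-switched replacement of the most salient ones. The Hugging Face-style model interface, the toy bilingual dictionary, and the top-k cutoff are illustrative assumptions, not the authors' released implementation.

```python
# Sketch: (1) score each token by the L2 norm of the loss gradient w.r.t. its
# input embedding, (2) replace the top-k most salient tokens with
# target-language words to build a code-switched training example.
import torch


def token_saliency(model, tokenizer, text, label):
    """Return (token, score) pairs; score = gradient norm at that token."""
    enc = tokenizer(text, return_tensors="pt")
    embeds = model.get_input_embeddings()(enc["input_ids"])
    embeds.retain_grad()  # keep the gradient of this non-leaf tensor
    out = model(inputs_embeds=embeds,
                attention_mask=enc["attention_mask"],
                labels=torch.tensor([label]))
    out.loss.backward()
    scores = embeds.grad.norm(dim=-1).squeeze(0)          # one score per token
    tokens = tokenizer.convert_ids_to_tokens(enc["input_ids"][0])
    return list(zip(tokens, scores.tolist()))


def code_switch(tokens_with_scores, bilingual_dict, top_k=3):
    """Replace up to top_k of the most salient tokens with target-language words."""
    ranked = sorted(range(len(tokens_with_scores)),
                    key=lambda i: tokens_with_scores[i][1],
                    reverse=True)
    tokens = [t for t, _ in tokens_with_scores]
    replaced = 0
    for i in ranked:
        if replaced >= top_k:
            break
        word = tokens[i].lstrip("\u2581")                  # strip SentencePiece marker
        if word in bilingual_dict:
            tokens[i] = bilingual_dict[word]
            replaced += 1
    return tokens


# Toy usage (dictionary entries are made up for illustration):
# pairs = token_saliency(model, tokenizer, "the movie was great", label=1)
# switched = code_switch(pairs, {"great": "gro\u00dfartig", "movie": "Film"})
```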