Linjuan Wu


2023

Good Meta-tasks Make A Better Cross-lingual Meta-transfer Learning for Low-resource Languages
Linjuan Wu | Zongyi Guo | Baoliang Cui | Haihong Tang | Weiming Lu
Findings of the Association for Computational Linguistics: EMNLP 2023

Model-agnostic meta-learning has garnered attention as a promising technique for enhancing few-shot cross-lingual transfer learning in low-resource scenarios. However, little attention has been paid to the impact of data selection strategies on this cross-lingual meta-transfer method, particularly the sampling of cross-lingual meta-training data (i.e., meta-tasks) at the syntactic level to reduce language gaps. In this paper, we propose a Meta-Task Collector-based Cross-lingual Meta-Transfer framework (MeTaCo-XMT) that adapts different data selection strategies to construct meta-tasks for meta-transfer learning. Because syntactic differences affect transfer performance, we consider a syntactic similarity sampling strategy and propose a syntactic distance metric model consisting of a syntactic encoder block based on a pre-trained model and a distance metric block using Word Mover’s Distance (WMD). Additionally, we conduct experiments with three different data selection strategies to instantiate our framework and analyze their impact on performance. Experimental results on two multilingual NLP datasets, WikiANN and TyDi QA, show that our approach significantly outperforms strong existing baselines.
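As a rough illustration of the distance metric block mentioned in this abstract, the sketch below computes the relaxed lower bound of Word Mover's Distance between two sentences. It is not the authors' model: the token vectors are hand-made 2-d toys standing in for the output of a syntactic encoder, the token weights are assumed uniform, and the relaxed bound is swapped in for the exact optimal-transport solution to keep the example dependency-free. All function and variable names are hypothetical.

```python
import math


def euclidean(u, v):
    """Euclidean distance between two equal-length vectors."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(u, v)))


def relaxed_wmd(src, tgt):
    """One-directional relaxed WMD, assuming uniform token weights.

    Each source token sends all of its mass to its nearest target token,
    which drops one of the two transport constraints of exact WMD.
    """
    weight = 1.0 / len(src)
    return sum(weight * min(euclidean(s, t) for t in tgt) for s in src)


def relaxed_wmd_bound(a, b):
    """Larger of the two one-directional relaxations (a lower bound on WMD)."""
    return max(relaxed_wmd(a, b), relaxed_wmd(b, a))


# Toy token vectors (assumption: stand-ins for syntactic encoder outputs).
sent_a = [[0.0, 0.0], [1.0, 0.0]]
sent_b = [[0.0, 0.0], [1.0, 0.0], [4.0, 0.0]]
sent_c = [[0.0, 3.0], [1.0, 3.0]]

dist_ab = relaxed_wmd_bound(sent_a, sent_b)
dist_ac = relaxed_wmd_bound(sent_a, sent_c)
```

Under a similarity-based sampling strategy, a collector could rank candidate source sentences by such a distance and prefer the closest ones when building meta-tasks; here `sent_b` would rank ahead of `sent_c` relative to `sent_a`.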

Struct-XLM: A Structure Discovery Multilingual Language Model for Enhancing Cross-lingual Transfer through Reinforcement Learning
Linjuan Wu | Weiming Lu
Proceedings of the 2023 Conference on Empirical Methods in Natural Language Processing

Cross-lingual transfer learning relies heavily on well-aligned cross-lingual representations. Syntactic structure is recognized as beneficial for cross-lingual transfer, but little research has used it to align representations in multilingual pre-trained language models (PLMs). Additionally, existing methods require syntactic labels, which are difficult to obtain and of poor quality for low-resource languages. To address this gap, we propose Struct-XLM, a novel multilingual language model that leverages reinforcement learning (RL) to autonomously discover universal syntactic structures for improving the cross-lingual representation alignment of PLMs. Struct-XLM integrates a policy network (PNet) and a translation ranking task. The PNet is designed to discover structural information and integrate it into the last layer of the PLM through a structural multi-head attention module to obtain a structural representation. The translation ranking task obtains a delayed reward based on the structural representation to optimize the PNet while improving the alignment of cross-lingual representations. Experiments on the XTREME benchmark show the effectiveness of the proposed approach for enhancing the cross-lingual transfer of multilingual PLMs.

2022

Learning Disentangled Semantic Representations for Zero-Shot Cross-Lingual Transfer in Multilingual Machine Reading Comprehension
Linjuan Wu | Shaojuan Wu | Xiaowang Zhang | Deyi Xiong | Shizhan Chen | Zhiqiang Zhuang | Zhiyong Feng
Proceedings of the 60th Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers)

Multilingual pre-trained models can transfer knowledge zero-shot from rich-resource to low-resource languages in machine reading comprehension (MRC). However, inherent linguistic discrepancies between languages can cause answer spans predicted by zero-shot transfer to violate the syntactic constraints of the target language. In this paper, we propose a novel multilingual MRC framework equipped with a Siamese Semantic Disentanglement Model (S2DM) to dissociate semantics from syntax in representations learned by multilingual pre-trained models. To explicitly transfer only semantic knowledge to the target language, we propose two groups of losses tailored for semantic and syntactic encoding and disentanglement. Experimental results on three multilingual MRC datasets (i.e., XQuAD, MLQA, and TyDi QA) demonstrate the effectiveness of our proposed approach over models based on mBERT and XLM-100.