Meizhen Liu
2023
InteMATs: Integrating Granularity-Specific Multilingual Adapters for Cross-Lingual Transfer
Meizhen Liu
|
Xu Guo
|
He Jiakai
|
Jianye Chen
|
Fengyu Zhou
|
Siu Hui
Findings of the Association for Computational Linguistics: EMNLP 2023
Multilingual language models (MLLMs) have achieved remarkable success in various cross-lingual transfer tasks. However, they suffer poor performance in zero-shot low-resource languages, particularly when dealing with longer contexts. Existing research mainly relies on full-model fine-tuning on large parallel datasets to enhance the cross-lingual alignment of MLLMs, which is computationally expensive. In this paper, we propose InteMATs, a novel approach that integrates multilingual adapters trained on texts of different levels of granularity. To achieve this, we curate a multilingual parallel dataset comprising 42 languages to pre-train sentence-level and document-level adapters under the contrastive learning framework. Extensive experiments demonstrate the effectiveness of InteMATs in improving the cross-lingual transfer performance of MLLMs, especially on low-resource languages. Finally, our comprehensive analyses and ablation studies provide a deep understanding of the high-quality representations derived by InteMATs.
2022
Adapters for Enhanced Modeling of Multilingual Knowledge and Text
Yifan Hou
|
Wenxiang Jiao
|
Meizhen Liu
|
Carl Allen
|
Zhaopeng Tu
|
Mrinmaya Sachan
Findings of the Association for Computational Linguistics: EMNLP 2022
Large language models appear to learn facts from the large text corpora they are trained on. Such facts are encoded implicitly within their many parameters, making it difficult to verify or manipulate what knowledge has been learned. Language models have recently been extended to multilingual language models (MLLMs), enabling knowledge to be learned across hundreds of languages. Meanwhile, knowledge graphs contain facts in an explicit triple format, which require careful and costly curation and are only available in a few high-resource languages, restricting their research and application. To address these issues, we propose to enhance MLLMs with knowledge from multilingual knowledge graphs (MLKGs) so as to tackle language and knowledge graph tasks across many languages, including low-resource ones. Specifically, we introducea lightweight adapter set to enhance MLLMs with cross-lingual entity alignment and facts from MLKGs for many languages. Experiments on common benchmarks show that such enhancement benefits both MLLMs and MLKGs, achieving: (1) comparable or improved performance for knowledge graph completion and entity alignment relative to baselines, especially for low-resource languages (for which knowledge graphs are unavailable); and (2) improved MLLM performance on language understanding tasks that require multilingual factual knowledge; all while maintaining performance on other general language tasks.
Search
Fix data
Co-authors
- Carl Allen 1
- Jianye Chen 1
- Xu Guo (郭旭) 1
- Yifan Hou 1
- Siu Hui 1
- show all...