PLUG: Leveraging Pivot Language in Cross-Lingual Instruction Tuning

Zhihan Zhang; Dong-Ho Lee; Yuwei Fang; Wenhao Yu; Mengzhao Jia; Meng Jiang; Francesco Barbieri

doi:10.18653/v1/2024.acl-long.379

Abstract

Instruction tuning has remarkably advanced large language models (LLMs) in understanding and responding to diverse human instructions. Despite the success in high-resource languages, its application in lower-resource ones faces challenges due to the imbalanced foundational abilities of LLMs across different languages, stemming from the uneven language distribution in their pre-training data. To tackle this issue, we propose pivot language guided generation (PLUG), an approach that utilizes a high-resource language, primarily English, as the pivot to enhance instruction tuning in lower-resource languages. It trains the model to first process instructions in the pivot language, and then produce responses in the target language. To evaluate our approach, we introduce a benchmark, X-AlpacaEval, of instructions in 4 languages (Chinese, Korean, Italian, and Spanish), each annotated by professional translators. Our approach demonstrates a significant improvement in the instruction-following abilities of LLMs by 29% on average, compared to directly responding in the target language alone. Further experiments validate the versatility of our approach by employing alternative pivot languages beyond English to assist languages where LLMs exhibit lower proficiency. Code and data are available at https://github.com/ytyz1307zzh/PLUG.
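The training setup described in the abstract (process the instruction in the pivot language first, then respond in the target language) can be illustrated with a minimal sketch of how one training pair might be assembled. Everything concrete here is an assumption for illustration: the `PivotExample` class, the `build_training_pair` function, the section labels, and the choice to represent the pivot stage as a pivot-language rendering of the instruction are hypothetical, and the released code at the repository above may format its data differently.

```python
from dataclasses import dataclass


@dataclass
class PivotExample:
    """One hypothetical training example (field names are assumptions)."""
    target_instruction: str  # instruction as written in the target language
    pivot_instruction: str   # the same instruction rendered in the pivot language
    target_response: str     # final response in the target language


def build_training_pair(ex: PivotExample, pivot: str = "English",
                        target: str = "Italian") -> tuple[str, str]:
    """Return (prompt, completion) for supervised fine-tuning.

    The completion places the pivot-language text before the
    target-language response, so the model learns to handle the
    instruction in the pivot language first.
    """
    prompt = f"Instruction ({target}):\n{ex.target_instruction}\n"
    completion = (
        f"Instruction ({pivot}):\n{ex.pivot_instruction}\n\n"
        f"Response ({target}):\n{ex.target_response}"
    )
    return prompt, completion


example = PivotExample(
    target_instruction="Elenca tre colori primari.",
    pivot_instruction="List three primary colors.",
    target_response="Rosso, giallo e blu.",
)
prompt, completion = build_training_pair(example)
print(prompt)
print(completion)
```

By contrast, the baseline the abstract compares against (responding directly in the target language) would correspond to a completion containing only the target-language response.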

Anthology ID: 2024.acl-long.379
Volume: Proceedings of the 62nd Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers)
Month: August
Year: 2024
Address: Bangkok, Thailand
Editors: Lun-Wei Ku, Andre Martins, Vivek Srikumar
Venue: ACL
Publisher: Association for Computational Linguistics
Pages: 7025–7046
URL: https://aclanthology.org/2024.acl-long.379/
DOI: 10.18653/v1/2024.acl-long.379
Cite (ACL): Zhihan Zhang, Dong-Ho Lee, Yuwei Fang, Wenhao Yu, Mengzhao Jia, Meng Jiang, and Francesco Barbieri. 2024. PLUG: Leveraging Pivot Language in Cross-Lingual Instruction Tuning. In Proceedings of the 62nd Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers), pages 7025–7046, Bangkok, Thailand. Association for Computational Linguistics.
Cite (Informal): PLUG: Leveraging Pivot Language in Cross-Lingual Instruction Tuning (Zhang et al., ACL 2024)
PDF: https://aclanthology.org/2024.acl-long.379.pdf
