Low-Resource Cross-Lingual Summarization through Few-Shot Learning with Large Language Models

Gyutae Park, Seojin Hwang, Hwanhee Lee


Abstract
Cross-lingual summarization (XLS) aims to generate a summary in a target language that differs from the language of the source document. While large language models (LLMs) have shown promising zero-shot XLS performance, their few-shot capabilities on this task remain unexplored, especially for low-resource languages with limited parallel data. In this paper, we investigate the few-shot XLS performance of several models, including Mistral-7B-Instruct-v0.2, GPT-3.5, and GPT-4. Our experiments demonstrate that few-shot learning significantly improves the XLS performance of LLMs, particularly GPT-3.5 and GPT-4, in low-resource settings. However, the open-source model Mistral-7B-Instruct-v0.2 struggles to adapt effectively to the XLS task with limited examples. Our findings highlight the potential of few-shot learning for improving XLS performance and the need for further research into LLM architectures and pre-training objectives tailored to this task. As future work, we suggest exploring more effective few-shot learning strategies and investigating the transfer-learning capabilities of LLMs for cross-lingual summarization.
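The paper evaluates few-shot (in-context) prompting of LLMs for XLS. As an illustration only, since the paper's exact prompt template, exemplar selection, and language pairs are not reproduced on this page, a minimal few-shot XLS prompt for the OpenAI chat API could be assembled as follows (the English-to-Korean pair and placeholder texts are assumptions):

    # Minimal sketch of few-shot cross-lingual summarization (XLS) prompting.
    # Illustrative only: the template, exemplars, and language pair below are
    # assumptions, not the paper's actual experimental setup.
    from openai import OpenAI

    client = OpenAI()  # assumes OPENAI_API_KEY is set in the environment

    # Hypothetical in-context exemplars: (source-language document, target-language summary).
    FEW_SHOT_EXAMPLES = [
        ("<English document 1>", "<Korean summary 1>"),
        ("<English document 2>", "<Korean summary 2>"),
    ]

    def build_messages(document, src_lang="English", tgt_lang="Korean"):
        """Interleave k exemplar document/summary turns, then append the query document."""
        messages = [{"role": "system",
                     "content": f"Summarize the given {src_lang} document in {tgt_lang}."}]
        for doc, summary in FEW_SHOT_EXAMPLES:
            messages.append({"role": "user", "content": f"Document:\n{doc}"})
            messages.append({"role": "assistant", "content": summary})
        messages.append({"role": "user", "content": f"Document:\n{document}"})
        return messages

    response = client.chat.completions.create(
        model="gpt-3.5-turbo",
        messages=build_messages("<source document to summarize>"),
    )
    print(response.choices[0].message.content)

Each exemplar is presented as a completed user/assistant exchange, so the model infers the summarize-and-translate behavior from the in-context pairs before seeing the query document.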
Anthology ID:
2024.loresmt-1.6
Volume:
Proceedings of the Seventh Workshop on Technologies for Machine Translation of Low-Resource Languages (LoResMT 2024)
Month:
August
Year:
2024
Address:
Bangkok, Thailand
Editors:
Atul Kr. Ojha, Chao-hong Liu, Ekaterina Vylomova, Flammie Pirinen, Jade Abbott, Jonathan Washington, Nathaniel Oco, Valentin Malykh, Varvara Logacheva, Xiaobing Zhao
Venues:
LoResMT | WS
Publisher:
Association for Computational Linguistics
Pages:
57–63
URL:
https://aclanthology.org/2024.loresmt-1.6
DOI:
10.18653/v1/2024.loresmt-1.6
Cite (ACL):
Gyutae Park, Seojin Hwang, and Hwanhee Lee. 2024. Low-Resource Cross-Lingual Summarization through Few-Shot Learning with Large Language Models. In Proceedings of the Seventh Workshop on Technologies for Machine Translation of Low-Resource Languages (LoResMT 2024), pages 57–63, Bangkok, Thailand. Association for Computational Linguistics.
Cite (Informal):
Low-Resource Cross-Lingual Summarization through Few-Shot Learning with Large Language Models (Park et al., LoResMT-WS 2024)
PDF:
https://aclanthology.org/2024.loresmt-1.6.pdf