Towards Unifying Multi-Lingual and Cross-Lingual Summarization

Jiaan Wang, Fandong Meng, Duo Zheng, Yunlong Liang, Zhixu Li, Jianfeng Qu, Jie Zhou


Abstract
To adapt text summarization to the multilingual world, previous work has proposed multi-lingual summarization (MLS) and cross-lingual summarization (CLS). However, the two tasks have been studied separately because they are defined differently, which hinders compatible and systematic research on both. In this paper, we aim to unify MLS and CLS into a more general setting, many-to-many summarization (M2MS), in which a single model can process documents in any language and generate summaries in any language. As a first step towards M2MS, we conduct preliminary studies showing that M2MS transfers task knowledge across languages better than MLS and CLS do. Furthermore, we propose Pisces, a pre-trained M2MS model that learns language modeling, cross-lingual ability, and summarization ability via three-stage pre-training. Experimental results indicate that Pisces significantly outperforms state-of-the-art baselines, especially in zero-shot directions, for which there is no training data mapping source-language documents to target-language summaries.
Anthology ID:
2023.acl-long.843
Volume:
Proceedings of the 61st Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers)
Month:
July
Year:
2023
Address:
Toronto, Canada
Editors:
Anna Rogers, Jordan Boyd-Graber, Naoaki Okazaki
Venue:
ACL
Publisher:
Association for Computational Linguistics
Pages:
15127–15143
URL:
https://aclanthology.org/2023.acl-long.843
DOI:
10.18653/v1/2023.acl-long.843
Cite (ACL):
Jiaan Wang, Fandong Meng, Duo Zheng, Yunlong Liang, Zhixu Li, Jianfeng Qu, and Jie Zhou. 2023. Towards Unifying Multi-Lingual and Cross-Lingual Summarization. In Proceedings of the 61st Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers), pages 15127–15143, Toronto, Canada. Association for Computational Linguistics.
Cite (Informal):
Towards Unifying Multi-Lingual and Cross-Lingual Summarization (Wang et al., ACL 2023)
PDF:
https://aclanthology.org/2023.acl-long.843.pdf
Video:
https://aclanthology.org/2023.acl-long.843.mp4