Multi-Task Learning for Cross-Lingual Abstractive Summarization

Sho Takase, Naoaki Okazaki


Abstract
We present a multi-task learning framework for cross-lingual abstractive summarization that augments the training data. Recent studies constructed pseudo cross-lingual abstractive summarization data to train their neural encoder-decoders. In contrast, we introduce existing genuine data, such as translation pairs and monolingual abstractive summarization data, into training. Our proposed method, Transum, attaches a special token to the beginning of the input sentence to indicate the target task. This special token lets us easily incorporate the genuine data into the training data. Experimental results show that Transum outperforms a model trained only on pseudo cross-lingual summarization data. In addition, we achieve the top ROUGE scores on Chinese-English and Arabic-English abstractive summarization. Transum also has a positive effect on machine translation: it improves performance over the strong Transformer baseline on Chinese-English, Arabic-English, and English-Japanese translation datasets.
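The abstract's core mechanism, prepending a task-indicator token to each source sentence so that one encoder-decoder can be trained on a mixture of tasks, can be sketched as below. This is an illustrative assumption of how such tagging might look; the token names (<trans>, <sum>, <cls-sum>) are hypothetical and not taken from the paper.

```python
# Hypothetical sketch of task-token tagging for multi-task training,
# in the spirit of Transum's special-token scheme. The token strings
# used here are illustrative assumptions, not the paper's vocabulary.

def tag_example(task_token: str, source: str, target: str) -> tuple:
    """Prepend the task token to the source side of a training pair."""
    return (f"{task_token} {source}", target)

# Genuine translation pair (Chinese -> English).
mt = tag_example("<trans>", "这 是 一 个 例子", "this is an example")

# Genuine monolingual abstractive summarization pair (English -> English).
ms = tag_example("<sum>", "a long english article about a topic", "short headline")

# Pseudo cross-lingual summarization pair (Chinese article -> English summary).
cl = tag_example("<cls-sum>", "一 篇 中文 文章", "an english summary")

# The three corpora can then be mixed into one training set for a single
# encoder-decoder; the leading token tells the model which task to perform.
training_data = [mt, ms, cl]
```

Tagging the input this way mirrors the target-language tokens used in multilingual NMT systems: the model learns to condition its output behavior on the first token of the source sequence, so no architectural change is needed to mix tasks.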
Anthology ID:
2022.lrec-1.322
Volume:
Proceedings of the Thirteenth Language Resources and Evaluation Conference
Month:
June
Year:
2022
Address:
Marseille, France
Venue:
LREC
Publisher:
European Language Resources Association
Pages:
3008–3016
URL:
https://aclanthology.org/2022.lrec-1.322
Cite (ACL):
Sho Takase and Naoaki Okazaki. 2022. Multi-Task Learning for Cross-Lingual Abstractive Summarization. In Proceedings of the Thirteenth Language Resources and Evaluation Conference, pages 3008–3016, Marseille, France. European Language Resources Association.
Cite (Informal):
Multi-Task Learning for Cross-Lingual Abstractive Summarization (Takase & Okazaki, LREC 2022)
PDF:
https://aclanthology.org/2022.lrec-1.322.pdf