Abstractive Summarization for the Ukrainian Language: Multi-Task Learning with Hromadske.ua News Dataset

Svitlana Galeshchuk

doi:10.18653/v1/2023.unlp-1.6

Abstract

Despite recent NLP developments, abstractive summarization remains a challenging task, especially for low-resource languages like Ukrainian. This paper aims to improve the quality of summaries produced by mT5 for Ukrainian news by fine-tuning the model on a mixture of summarization and text similarity tasks, using summary-article and title-article training pairs, respectively. The proposed training set-up produces higher-quality summaries with the small, base, and large mT5 models. We also present a new Ukrainian dataset for the abstractive summarization task that consists of circa 36.5K articles collected from Hromadske.ua until June 2021.
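The multi-task mixture described in the abstract can be illustrated with a minimal sketch of how the two kinds of training pairs might be turned into text-to-text examples. The abstract does not state the input format, so the task prefixes, the record field names (`title`, `article`, `summary`), the `"1"`/`"0"` similarity targets, and the use of a mismatched title as a negative pair are all assumptions made for illustration, not the author's set-up.

```python
import random
from typing import Dict, List

# Hypothetical task prefixes: the abstract does not specify the input format.
SUMMARIZE_PREFIX = "summarize: "
SIMILARITY_PREFIX = "similarity: "


def build_mixture(records: List[Dict[str, str]], seed: int = 0) -> List[Dict[str, str]]:
    """Build a shuffled mixture of summarization and similarity examples.

    Each record is assumed to have "title", "article", and "summary" keys.
    """
    examples: List[Dict[str, str]] = []
    n = len(records)
    for i, rec in enumerate(records):
        # Summarization task: summary-article pair (article in, summary out).
        examples.append({
            "task": "summarization",
            "input": SUMMARIZE_PREFIX + rec["article"],
            "target": rec["summary"],
        })
        # Similarity task: title-article pair, labeled "1" for a matching title.
        examples.append({
            "task": "similarity",
            "input": SIMILARITY_PREFIX + "title: " + rec["title"]
                     + " article: " + rec["article"],
            "target": "1",
        })
        # Assumed negative pair: the title of another record, labeled "0".
        if n > 1:
            other = records[(i + 1) % n]
            examples.append({
                "task": "similarity",
                "input": SIMILARITY_PREFIX + "title: " + other["title"]
                         + " article: " + rec["article"],
                "target": "0",
            })
    # Shuffle so that both tasks are interleaved during fine-tuning.
    random.Random(seed).shuffle(examples)
    return examples


if __name__ == "__main__":
    demo = [
        {"title": "Title A", "article": "Article A text.", "summary": "Summary A."},
        {"title": "Title B", "article": "Article B text.", "summary": "Summary B."},
    ]
    for example in build_mixture(demo):
        print(example["task"], "|", example["input"], "->", example["target"])
```

The resulting list of input and target strings could then be tokenized and passed to any sequence-to-sequence fine-tuning loop for an mT5 checkpoint; the mixing ratio between the two tasks is left at whatever the pair construction yields here and would be a tunable choice in practice.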

Anthology ID: 2023.unlp-1.6
Volume: Proceedings of the Second Ukrainian Natural Language Processing Workshop (UNLP)
Month: May
Year: 2023
Address: Dubrovnik, Croatia
Editor: Mariana Romanyshyn
Venue: UNLP
Publisher: Association for Computational Linguistics
Pages: 49–53
URL: https://aclanthology.org/2023.unlp-1.6/
DOI: 10.18653/v1/2023.unlp-1.6
Cite (ACL): Svitlana Galeshchuk. 2023. Abstractive Summarization for the Ukrainian Language: Multi-Task Learning with Hromadske.ua News Dataset. In Proceedings of the Second Ukrainian Natural Language Processing Workshop (UNLP), pages 49–53, Dubrovnik, Croatia. Association for Computational Linguistics.
Cite (Informal): Abstractive Summarization for the Ukrainian Language: Multi-Task Learning with Hromadske.ua News Dataset (Galeshchuk, UNLP 2023)
PDF: https://aclanthology.org/2023.unlp-1.6.pdf
Video: https://aclanthology.org/2023.unlp-1.6.mp4
