Translation-Augmented Multilingual Summarization for Low-Resource Languages

Prasanth

Abstract

While automatic text summarization has achieved remarkable success in English, extending these capabilities to low-resource languages remains a significant challenge due to the scarcity of labeled training data. We propose a translation-augmented approach to multilingual summarization: we systematically translate high-quality English summarization corpora into low-resource target languages using NLLB-200, and use the resulting parallel data to train and evaluate sequence-to-sequence models. We experiment across three typologically diverse languages (Swahili, Hausa, and Afrikaans), comparing monolingual fine-tuning (MONO), cross-lingual transfer (XLT), and joint multilingual training (TAMT) on mBART-large-50. Monolingual fine-tuning achieves the best performance for Swahili (ROUGE-L 13.9) and Afrikaans (ROUGE-L 15.7), surpassing the Lead-3 baseline in both cases, while cross-lingual transfer remains strongest for Hausa (ROUGE-L 14.5). We show that native language token availability in mBART-50 is a critical determinant of fine-tuning performance, and characterize the conditions under which the theoretically expected TAMT > MONO > XLT ordering breaks down. We release our dataset, code, and evaluation infrastructure to support future research on low-resource multilingual summarization.

Anthology ID: 2026.ltedi-1.10
Volume: Proceedings of the Sixth Workshop on Language Technology for Equality, Diversity, Inclusion
Month: July
Year: 2026
Address: Virtual (Online)
Editors: Bharathi Raja Chakravarthi, Bharathi B, Paul Buitelaar, Durairaj Thenmozhi, Miguel Ángel García Cumbreras, Salud María Jiménez Zafra
Venues: LTEDI | WS
Publisher: Association for Computational Linguistics
Pages: 108–117
URL: https://aclanthology.org/2026.ltedi-1.10/
Cite (ACL): Prasanth. 2026. Translation-Augmented Multilingual Summarization for Low-Resource Languages. In Proceedings of the Sixth Workshop on Language Technology for Equality, Diversity, Inclusion, pages 108–117, Virtual (Online). Association for Computational Linguistics.
Cite (Informal): Translation-Augmented Multilingual Summarization for Low-Resource Languages (Prasanth, LTEDI 2026)
PDF: https://aclanthology.org/2026.ltedi-1.10.pdf
