Contemporary LLMs and Literary Abridgement: An Analytical Inquiry

Iglika Nikolova-Stoupak; Gaël Lejeune; Eva Schaeffer-Lacroix

Abstract

This study evaluates several contemporary Large Language Models (ChatGPT, Gemini Pro, Mistral-Instruct and BgGPT) on their ability to generate abridged versions of literary texts. The analysis is based on ‘The Ugly Duckling’ by H. C. Andersen as translated into English, French and Bulgarian. The abridgement scenarios tested include zero-shot, one-shot, division into chunks and crosslingual (including chain-of-thought) abridgement. The resulting texts are evaluated both automatically and by human evaluators. The automatic analysis includes ROUGE and BERTScore, as well as the ratios of selected readability-related textual features (e.g. number of words, type-to-token ratio) between the original and the automatically abridged texts. Professionally composed abridged versions serve as the gold standard. Following the automatic analysis, the six best candidate texts per language are evaluated by university-educated volunteers on more qualitative textual characteristics, such as coherence, consistency and aesthetic appeal.
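As a rough illustration of the feature ratios mentioned in the abstract, the sketch below computes the number of words and the type-to-token ratio for two texts and divides the abridged value by the original value. The regex tokenizer, the lowercasing, the direction of the ratio (abridged over original) and the function names are all assumptions made for this example; the abstract does not specify how the paper computes these features.

```python
import re


def tokenize(text):
    # Assumption: lowercase word tokens from a simple Unicode-aware regex.
    # The paper's actual tokenization is not given in the abstract.
    return re.findall(r"\w+", text.lower())


def features(text):
    # Two readability-related features named in the abstract:
    # number of words and type-to-token ratio (distinct tokens / all tokens).
    tokens = tokenize(text)
    n_words = len(tokens)
    ttr = len(set(tokens)) / n_words if n_words else 0.0
    return {"n_words": n_words, "ttr": ttr}


def feature_ratios(original, abridged):
    # Assumption: each ratio is abridged / original; the direction used
    # in the paper is not stated in the abstract.
    orig = features(original)
    abr = features(abridged)
    return {
        name: (abr[name] / orig[name] if orig[name] else float("nan"))
        for name in orig
    }


if __name__ == "__main__":
    ratios = feature_ratios(
        "The duckling was big and the duckling was ugly",
        "The duckling was ugly",
    )
    print(ratios)
```

Under these assumptions, a word-count ratio below 1 indicates a shorter abridged text, and a type-to-token ratio above 1 indicates that the abridged text repeats its vocabulary less than the original does.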

Anthology ID: 2024.clib-1.4
Volume: Proceedings of the Sixth International Conference on Computational Linguistics in Bulgaria (CLIB 2024)
Month: September
Year: 2024
Address: Sofia, Bulgaria
Venue: CLIB
Publisher: Department of Computational Linguistics, Institute for Bulgarian Language, Bulgarian Academy of Sciences
Pages: 39–57
URL: https://aclanthology.org/2024.clib-1.4/
Cite (ACL): Iglika Nikolova-Stoupak, Gaël Lejeune, and Eva Schaeffer-Lacroix. 2024. Contemporary LLMs and Literary Abridgement: An Analytical Inquiry. In Proceedings of the Sixth International Conference on Computational Linguistics in Bulgaria (CLIB 2024), pages 39–57, Sofia, Bulgaria. Department of Computational Linguistics, Institute for Bulgarian Language, Bulgarian Academy of Sciences.
Cite (Informal): Contemporary LLMs and Literary Abridgement: An Analytical Inquiry (Nikolova-Stoupak et al., CLIB 2024)
PDF: https://aclanthology.org/2024.clib-1.4.pdf
