Comparing Generic and Expert Models for Genre-Specific Text Simplification

Zihao Li, Matthew Shardlow, Fernando Alva-Manchego


Abstract
We investigate how text genre influences the performance of models for controlled text simplification. Treating datasets from Wikipedia and PubMed as two different genres, we compare the performance of genre-specific models trained by transfer learning against prompt-only GPT-like large language models. Our experiments show that: (1) the performance loss of genre-specific models on general tasks can be limited to 2%, (2) transfer learning can improve performance on genre-specific datasets by up to 10% in SARI score over the base model without transfer learning, and (3) simplifications generated by the smaller but more customized models show simplicity comparable to, and meaning preservation better than, the larger generic models in both automatic and human evaluations.
Anthology ID:
2023.tsar-1.6
Volume:
Proceedings of the Second Workshop on Text Simplification, Accessibility and Readability
Month:
September
Year:
2023
Address:
Varna, Bulgaria
Editors:
Sanja Štajner, Horacio Saggion, Matthew Shardlow, Fernando Alva-Manchego
Venues:
TSAR | WS
Publisher:
INCOMA Ltd., Shoumen, Bulgaria
Pages:
51–67
URL:
https://aclanthology.org/2023.tsar-1.6
Cite (ACL):
Zihao Li, Matthew Shardlow, and Fernando Alva-Manchego. 2023. Comparing Generic and Expert Models for Genre-Specific Text Simplification. In Proceedings of the Second Workshop on Text Simplification, Accessibility and Readability, pages 51–67, Varna, Bulgaria. INCOMA Ltd., Shoumen, Bulgaria.
Cite (Informal):
Comparing Generic and Expert Models for Genre-Specific Text Simplification (Li et al., TSAR-WS 2023)
PDF:
https://aclanthology.org/2023.tsar-1.6.pdf