SciGisPy: a Novel Metric for Biomedical Text Simplification via Gist Inference Score

Chen Lyu; Gabriele Pergola

doi:10.18653/v1/2024.tsar-1.10

Abstract

Biomedical literature is often written in highly specialized language, posing significant comprehension challenges for non-experts. Automatic text simplification (ATS) offers a solution by making such texts more accessible while preserving critical information. However, evaluating ATS for biomedical texts is still challenging due to the limitations of existing evaluation metrics. General-domain metrics like SARI, BLEU, and ROUGE focus on surface-level text features, and readability metrics like FKGL and ARI fail to account for domain-specific terminology or assess how well the simplified text conveys core meanings (gist). To address this, we introduce SciGisPy, a novel evaluation metric inspired by Gist Inference Score (GIS) from Fuzzy-Trace Theory (FTT). SciGisPy measures how well a simplified text facilitates the formation of abstract inferences (gist) necessary for comprehension, especially in the biomedical domain. We revise GIS for this purpose by introducing domain-specific enhancements, including semantic chunking, Information Content (IC) theory, and specialized embeddings, while removing unsuitable indexes. Our experimental evaluation on the Cochrane biomedical text simplification dataset demonstrates that SciGisPy outperforms the original GIS formulation, with a significant increase in correctly identified simplified texts (84% versus 44.8%). The results and a thorough ablation study confirm that SciGisPy better captures the essential meaning of biomedical content, outperforming existing approaches.

Anthology ID: 2024.tsar-1.10
Volume: Proceedings of the Third Workshop on Text Simplification, Accessibility and Readability (TSAR 2024)
Month: November
Year: 2024
Address: Miami, Florida, USA
Editors: Matthew Shardlow, Horacio Saggion, Fernando Alva-Manchego, Marcos Zampieri, Kai North, Sanja Štajner, Regina Stodden
Venues: TSAR | WS
Publisher: Association for Computational Linguistics
Pages: 95–106
URL: https://aclanthology.org/2024.tsar-1.10/
DOI: 10.18653/v1/2024.tsar-1.10
Cite (ACL): Chen Lyu and Gabriele Pergola. 2024. SciGisPy: a Novel Metric for Biomedical Text Simplification via Gist Inference Score. In Proceedings of the Third Workshop on Text Simplification, Accessibility and Readability (TSAR 2024), pages 95–106, Miami, Florida, USA. Association for Computational Linguistics.
Cite (Informal): SciGisPy: a Novel Metric for Biomedical Text Simplification via Gist Inference Score (Lyu & Pergola, TSAR 2024)
PDF: https://aclanthology.org/2024.tsar-1.10.pdf
