Evaluating LLMs for Targeted Concept Simplification for Domain-Specific Texts

Sumit Asthana, Hannah Rashkin, Elizabeth Clark, Fantine Huot, Mirella Lapata


Abstract
One useful application of NLP models is to support people in reading complex text from unfamiliar domains (e.g., scientific articles). Simplifying the entire text makes it understandable but sometimes removes important details. In contrast, helping adult readers understand difficult concepts in context can enhance their vocabulary and knowledge. In a preliminary human study, we first identify that lack of context and unfamiliarity with difficult concepts are major reasons for adult readers’ difficulty with domain-specific text. We then introduce targeted concept simplification, a simplification task for rewriting text to help readers comprehend text containing unfamiliar concepts. We also introduce WikiDomains, a new dataset of 22k definitions from 13 academic domains, each paired with a difficult concept within the definition. We benchmark the performance of open-source and commercial LLMs and a simple dictionary baseline on this task, using human judgments of ease of understanding and meaning preservation. Interestingly, our human judges preferred explanations of the difficult concept over simplifications of the concept phrase. Further, no single model achieved superior performance across all quality dimensions, and automated metrics show low correlations (~0.2) with human evaluations of concept simplification, opening up rich avenues for research on personalized human reading comprehension support.
Anthology ID:
2024.emnlp-main.357
Volume:
Proceedings of the 2024 Conference on Empirical Methods in Natural Language Processing
Month:
November
Year:
2024
Address:
Miami, Florida, USA
Editors:
Yaser Al-Onaizan, Mohit Bansal, Yun-Nung Chen
Venue:
EMNLP
Publisher:
Association for Computational Linguistics
Pages:
6208–6226
URL:
https://aclanthology.org/2024.emnlp-main.357
Cite (ACL):
Sumit Asthana, Hannah Rashkin, Elizabeth Clark, Fantine Huot, and Mirella Lapata. 2024. Evaluating LLMs for Targeted Concept Simplification for Domain-Specific Texts. In Proceedings of the 2024 Conference on Empirical Methods in Natural Language Processing, pages 6208–6226, Miami, Florida, USA. Association for Computational Linguistics.
Cite (Informal):
Evaluating LLMs for Targeted Concept Simplification for Domain-Specific Texts (Asthana et al., EMNLP 2024)
PDF:
https://aclanthology.org/2024.emnlp-main.357.pdf