E-Bench: Towards Evaluating the Ease-of-Use of Large Language Models

Zhenyu Zhang, Bingguang Hao, Jinpeng Li, Zekai Zhang, Dongyan Zhao


Abstract
Modern large language models are sensitive to prompts: a synonymous rephrasing or a typo can lead to unexpected model behavior. Composing an optimal prompt for a specific demand lacks theoretical support and relies entirely on human experimentation, which poses a considerable obstacle to popularizing generative artificial intelligence. Yet there has been no systematic analysis of how well large language models resist prompt perturbations. In this work, we propose to evaluate the ease-of-use of large language models and construct E-Bench, which simulates actual human usage through synonymous perturbation (including paraphrasing, simplification, and colloquialism) and typographical perturbation. We also discuss the combination of these two types of perturbation and analyze the main causes of performance degradation. Experimental results indicate that although ease-of-use improves significantly as model size increases, there is still a long way to go before models are sufficiently user-friendly.
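As an illustration of the typographical perturbation described in the abstract, the following minimal sketch injects character-swap typos into a prompt at a configurable rate. This is a hypothetical example for intuition only; the function name, swap strategy, and rate are assumptions, not E-Bench's actual procedure.

```python
import random


def typo_perturb(prompt: str, rate: float = 0.05, seed: int = 0) -> str:
    """Introduce adjacent-character-swap typos into a prompt.

    Illustrative sketch only: E-Bench's exact perturbation method may differ.
    """
    rng = random.Random(seed)  # fixed seed for reproducible perturbations
    chars = list(prompt)
    i = 0
    while i < len(chars) - 1:
        # Only swap within words, at the given per-position rate
        if chars[i].isalpha() and chars[i + 1].isalpha() and rng.random() < rate:
            chars[i], chars[i + 1] = chars[i + 1], chars[i]
            i += 2  # skip past the swapped pair
        else:
            i += 1
    return "".join(chars)


original = "Summarize the following article in two sentences."
perturbed = typo_perturb(original, rate=0.3)
```

A benchmark along these lines would then compare model outputs on `original` versus `perturbed` prompts; a large quality gap indicates low robustness to typographical noise.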
Anthology ID:
2025.coling-main.159
Volume:
Proceedings of the 31st International Conference on Computational Linguistics
Month:
January
Year:
2025
Address:
Abu Dhabi, UAE
Editors:
Owen Rambow, Leo Wanner, Marianna Apidianaki, Hend Al-Khalifa, Barbara Di Eugenio, Steven Schockaert
Venue:
COLING
Publisher:
Association for Computational Linguistics
Pages:
2329–2339
URL:
https://aclanthology.org/2025.coling-main.159/
Cite (ACL):
Zhenyu Zhang, Bingguang Hao, Jinpeng Li, Zekai Zhang, and Dongyan Zhao. 2025. E-Bench: Towards Evaluating the Ease-of-Use of Large Language Models. In Proceedings of the 31st International Conference on Computational Linguistics, pages 2329–2339, Abu Dhabi, UAE. Association for Computational Linguistics.
Cite (Informal):
E-Bench: Towards Evaluating the Ease-of-Use of Large Language Models (Zhang et al., COLING 2025)
PDF:
https://aclanthology.org/2025.coling-main.159.pdf