From Wordle to Fibble⁵: Evaluating LLM Reasoning Under Escalating Deception

Chang Liu

Abstract

Standard benchmarks for large language models (LLMs) assume that task feedback is truthful, but real-world reasoning often requires processing unreliable or adversarial information. We introduce WordleArenas, a benchmark platform that evaluates LLM reasoning robustness across a deception gradient. Building on Wordle and its deceptive variant Fibble (Chusap et al., 2025), we generalize to Fibbleᵏ (k = 0, ..., 5 lies per row), creating a controlled evaluation of LLM robustness to misinformation. Across six arenas, from standard Wordle (0 lies per row) through Fibble⁵ (5 lies per row), we evaluate 41 models from 10 providers across 3,749 games. We find that (1) even one lie per row causes a catastrophic performance drop (average win rate falls from 41.1% to 18.7%), (2) a sharp deception cliff emerges at 2–3 lies per row, where nearly all models collapse to a win rate of at most 3%, and (3) model robustness to deception is poorly predicted by standard benchmark rankings. A surprising Fibble⁵ recovery also emerges: some models regain partial performance when all feedback lies (average win rate 9.5%), outperforming Fibble³ (0.3%) and Fibble⁴ (0.4%), because knowing that every tile lies restores deterministic, though partial, information. Our results demonstrate that truthful-feedback evaluations systematically overestimate LLM reasoning capabilities and that deception-aware benchmarks are essential for assessing real-world robustness. All code and data are publicly available.
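To make the "k lies per row" idea concrete, the following is a minimal sketch of how a Fibbleᵏ feedback row could be generated. It is not taken from the paper's released code. The function names `true_feedback` and `fibble_feedback` are hypothetical, and the sketch assumes that lie positions are chosen uniformly at random and that each lying tile shows one of the two wrong colors with equal probability; the abstract does not specify these details.

```python
import random

COLORS = ("green", "yellow", "gray")


def true_feedback(guess: str, answer: str) -> list[str]:
    """Standard Wordle scoring, including duplicate-letter handling."""
    result = ["gray"] * len(guess)
    # Count answer letters that are not already matched exactly (green).
    remaining: dict[str, int] = {}
    for g, a in zip(guess, answer):
        if g != a:
            remaining[a] = remaining.get(a, 0) + 1
    for i, (g, a) in enumerate(zip(guess, answer)):
        if g == a:
            result[i] = "green"
        elif remaining.get(g, 0) > 0:
            result[i] = "yellow"
            remaining[g] -= 1
    return result


def fibble_feedback(guess: str, answer: str, k: int, rng: random.Random) -> list[str]:
    """Return a feedback row in which exactly k tiles show a false color.

    Assumption (not from the source): lie positions and false colors are
    sampled uniformly at random.
    """
    feedback = true_feedback(guess, answer)
    for i in rng.sample(range(len(guess)), k):
        wrong = [c for c in COLORS if c != feedback[i]]
        feedback[i] = rng.choice(wrong)
    return feedback


if __name__ == "__main__":
    rng = random.Random(0)
    for k in range(6):
        print(k, fibble_feedback("crane", "cigar", k, rng))
```

Under these assumptions, the k = 5 case illustrates the recovery described above: because every tile is known to lie, the displayed color can be ruled out for each position, leaving two candidate colors per tile rather than three. For intermediate k, the solver does not know which tiles are false, so no single tile can be trusted or ruled out on its own.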

Anthology ID: 2026.evaleval-1.5
Volume: Proceedings of the Workshop on Evaluating Evaluations (EvalEval)
Month: July
Year: 2026
Address: San Diego, CA
Editors: Mubashara Akhtar, Jan Batzner, Leshem Choshen, Avijit Ghosh, Usman Gohar, Jennifer Mickel, Ichhya Pant, Zeerak Talat, Michelle Lin
Venues: EvalEval | WS
Publisher: Association for Computational Linguistics
Pages: 36–45
URL: https://aclanthology.org/2026.evaleval-1.5/
Cite (ACL): Chang Liu. 2026. From Wordle to Fibble⁵: Evaluating LLM Reasoning Under Escalating Deception. In Proceedings of the Workshop on Evaluating Evaluations (EvalEval), pages 36–45, San Diego, CA. Association for Computational Linguistics.
Cite (Informal): From Wordle to Fibble⁵: Evaluating LLM Reasoning Under Escalating Deception (Liu, EvalEval 2026)
PDF: https://aclanthology.org/2026.evaleval-1.5.pdf
