Flip-Flop Consistency: Unsupervised Training for Robustness to Prompt Perturbations in LLMs

Parsa Hejabi; Elnaz Rahmati; Alireza Salkhordeh Ziabari; Morteza Dehghani

Abstract

Large Language Models (LLMs) often produce inconsistent answers when faced with different phrasings of the same prompt. In this paper, we propose Flip-Flop Consistency (F²C), an unsupervised training method that improves robustness to such perturbations. F²C is composed of two key components. The first, Consensus Cross-Entropy (CCE), uses a majority vote across prompt variations to create a hard pseudo-label. The second is a representation alignment loss that pulls lower-confidence and non-majority predictors toward the consensus established by high-confidence, majority-voting variations. We evaluate our method on 11 datasets spanning four NLP tasks, with 4–15 prompt variations per dataset. On average, F²C raises observed agreement by 11.62%, improves mean F₁ by 8.94%, and reduces performance variance across formats by 3.29%. In out-of-domain evaluations, F²C generalizes effectively, increasing mean F₁ and agreement while decreasing variance across most source-target pairs. Finally, when trained on only a subset of prompt perturbations and evaluated on held-out formats, F²C consistently improves both performance and agreement while reducing variance. These findings highlight F²C as an effective unsupervised method for enhancing LLM consistency, performance, and generalization under prompt perturbations.
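The majority-vote pseudo-labeling step behind CCE can be illustrated with a minimal sketch. This is not the authors' implementation: the function names, the tie-breaking rule (first label seen wins), and the choice to average the negative log-likelihood of the pseudo-label over variations are all assumptions made for illustration, and the representation alignment loss is not shown.

```python
import math
from collections import Counter


def consensus_label(predictions):
    """Majority vote over the labels predicted for each prompt variation.

    Returns the winning label and the share of variations that voted for it.
    Ties go to the label seen first (an assumption, not from the paper).
    """
    counts = Counter(predictions)
    label, votes = counts.most_common(1)[0]
    return label, votes / len(predictions)


def consensus_cross_entropy(distributions, label):
    """Mean negative log-probability of the pseudo-label across variations.

    `distributions` holds one dict per prompt variation, mapping each
    candidate label to the model's probability for it (hypothetical format).
    """
    losses = [-math.log(dist[label]) for dist in distributions]
    return sum(losses) / len(losses)


# Four phrasings of the same input; three of them agree.
predictions = ["positive", "positive", "negative", "positive"]
label, share = consensus_label(predictions)

# Toy probabilities for two of the variations.
distributions = [
    {"positive": 0.5, "negative": 0.5},
    {"positive": 0.25, "negative": 0.75},
]
loss = consensus_cross_entropy(distributions, label)
```

Here the second variation disagrees with the consensus, so it contributes the larger share of the loss, which is the sense in which a hard pseudo-label pushes non-majority variations toward the majority answer without any gold labels.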

Anthology ID: 2026.acl-long.71
Volume: Proceedings of the 64th Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers)
Month: July
Year: 2026
Address: San Diego, California, United States
Editors: Maria Liakata, Viviane P. Moreira, Jiajun Zhang, David Jurgens
Venue: ACL
Publisher: Association for Computational Linguistics
Pages: 1571–1587
URL: https://aclanthology.org/2026.acl-long.71/
Cite (ACL): Parsa Hejabi, Elnaz Rahmati, Alireza Salkhordeh Ziabari, and Morteza Dehghani. 2026. Flip-Flop Consistency: Unsupervised Training for Robustness to Prompt Perturbations in LLMs. In Proceedings of the 64th Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers), pages 1571–1587, San Diego, California, United States. Association for Computational Linguistics.
Cite (Informal): Flip-Flop Consistency: Unsupervised Training for Robustness to Prompt Perturbations in LLMs (Hejabi et al., ACL 2026)
PDF: https://aclanthology.org/2026.acl-long.71.pdf
Checklist: 2026.acl-long.71.checklist.pdf
