It’s Not Easy Being Wrong: Large Language Models Struggle with Process of Elimination Reasoning

Nishant Balepur, Shramay Palta, Rachel Rudinger


Abstract
Chain-of-thought (COT) prompting can help large language models (LLMs) reason toward correct answers, but its efficacy in reasoning toward incorrect answers is unexplored. This process of elimination (PoE), when used with COT, can enhance self-consistency, interpretability, and tasks such as medical diagnoses of exclusion. Hence, we propose PoE with COT, a strategy where LLMs must reason toward the incorrect options on multiple-choice questions. We evaluate the ability of GPT-3.5, LLaMA-2, and Falcon to perform PoE with COT on four commonsense and scientific reasoning datasets. We find that PoE consistently underperforms the strategy of directly choosing the correct answer, and that the two strategies agree with each other less often than each strategy agrees with itself under self-consistency. To study these issues further, we conduct error analyses and give suggestions for future work.
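To make the contrast concrete, the sketch below illustrates the two prompting strategies the abstract describes: standard chain-of-thought toward the correct option versus process of elimination (PoE) with chain-of-thought toward the incorrect options. The prompt wording and helper names are hypothetical paraphrases for illustration, not the paper's exact templates.

```python
# Illustrative sketch (not the paper's exact prompts) of the two multiple-choice
# strategies: (1) CoT toward the correct answer, (2) PoE with CoT toward the
# incorrect answers, eliminating them so only the correct option remains.

def format_options(options):
    """Render answer options as lettered lines, e.g. 'A. salmon'."""
    return "\n".join(f"{chr(ord('A') + i)}. {opt}" for i, opt in enumerate(options))

def cot_prompt(question, options):
    """Standard strategy: reason step by step toward the single correct option."""
    return (
        f"Question: {question}\n{format_options(options)}\n"
        "Think step by step, then state the letter of the correct answer."
    )

def poe_cot_prompt(question, options):
    """PoE strategy: reason step by step about why each option is incorrect."""
    return (
        f"Question: {question}\n{format_options(options)}\n"
        "Think step by step about why each option is incorrect and eliminate it. "
        "State the letters of all incorrect answers."
    )

if __name__ == "__main__":
    q = "Which of these is a mammal?"
    opts = ["salmon", "dolphin", "gecko", "sparrow"]
    print(cot_prompt(q, opts))
    print()
    print(poe_cot_prompt(q, opts))
```

Under the PoE strategy, a response is counted as correct only if the model eliminates every incorrect option and leaves the correct answer standing, which is the behavior the paper finds LLMs struggle with.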
Anthology ID:
2024.findings-acl.604
Volume:
Findings of the Association for Computational Linguistics: ACL 2024
Month:
August
Year:
2024
Address:
Bangkok, Thailand and virtual meeting
Editors:
Lun-Wei Ku, Andre Martins, Vivek Srikumar
Venue:
Findings
Publisher:
Association for Computational Linguistics
Pages:
10143–10166
URL:
https://aclanthology.org/2024.findings-acl.604
Cite (ACL):
Nishant Balepur, Shramay Palta, and Rachel Rudinger. 2024. It’s Not Easy Being Wrong: Large Language Models Struggle with Process of Elimination Reasoning. In Findings of the Association for Computational Linguistics: ACL 2024, pages 10143–10166, Bangkok, Thailand and virtual meeting. Association for Computational Linguistics.
Cite (Informal):
It’s Not Easy Being Wrong: Large Language Models Struggle with Process of Elimination Reasoning (Balepur et al., Findings 2024)
PDF:
https://aclanthology.org/2024.findings-acl.604.pdf