Enhancing Ethical Explanations of Large Language Models through Iterative Symbolic Refinement

Xin Quan, Marco Valentino, Louise Dennis, Andre Freitas


Abstract
An increasing amount of research in Natural Language Inference (NLI) focuses on the application and evaluation of Large Language Models (LLMs) and their reasoning capabilities. Despite their success, however, LLMs are still prone to factual errors and inconsistencies in their explanations, offering limited control and interpretability for inference in complex domains. In this paper, we focus on ethical NLI, investigating how hybrid neuro-symbolic techniques can enhance the logical validity and alignment of ethical explanations produced by LLMs. Specifically, we present an abductive-deductive framework named Logic-Explainer, which integrates LLMs with an external backward-chaining solver to refine step-wise natural language explanations and jointly verify their correctness, reduce incompleteness and minimise redundancy. An extensive empirical analysis demonstrates that Logic-Explainer can improve explanations generated via in-context learning methods and Chain-of-Thought (CoT) on challenging ethical NLI tasks, while, at the same time, producing formal proofs describing and supporting models’ reasoning. As ethical NLI requires commonsense reasoning to identify underlying moral violations, our results suggest the effectiveness of neuro-symbolic methods for multi-step NLI more broadly, opening new opportunities to enhance the logical consistency, reliability, and alignment of LLMs.
Anthology ID:
2024.eacl-long.1
Volume:
Proceedings of the 18th Conference of the European Chapter of the Association for Computational Linguistics (Volume 1: Long Papers)
Month:
March
Year:
2024
Address:
St. Julian’s, Malta
Editors:
Yvette Graham, Matthew Purver
Venue:
EACL
Publisher:
Association for Computational Linguistics
Pages:
1–22
URL:
https://aclanthology.org/2024.eacl-long.1
Cite (ACL):
Xin Quan, Marco Valentino, Louise Dennis, and Andre Freitas. 2024. Enhancing Ethical Explanations of Large Language Models through Iterative Symbolic Refinement. In Proceedings of the 18th Conference of the European Chapter of the Association for Computational Linguistics (Volume 1: Long Papers), pages 1–22, St. Julian’s, Malta. Association for Computational Linguistics.
Cite (Informal):
Enhancing Ethical Explanations of Large Language Models through Iterative Symbolic Refinement (Quan et al., EACL 2024)
PDF:
https://aclanthology.org/2024.eacl-long.1.pdf