Xingnan Jin

Also published as: 醒男


2025

Large Language Models (LLMs) have shown remarkable capabilities across a variety of tasks, including logical reasoning. However, their propensity for generating incorrect or inconsistent responses remains a significant concern. To address this issue, we propose FoVer (First-order logic Verification), an automated pipeline that verifies the logical correctness of reasoning texts using first-order logic. The pipeline operates in two main steps: (1) LLM-driven translation of natural language into executable logical expressions, and (2) automated logical verification using the Z3 theorem prover. We evaluate FoVer on specialized logical datasets (ProofWriter and FOLIO) and on real-world LLM outputs (REVEAL). The results demonstrate that FoVer significantly outperforms existing logical-verification methods, with notable accuracy improvements in both idealized and practical scenarios. The pipeline also shows potential for identifying annotation errors in existing datasets and could be used to construct new logical reasoning datasets. This work represents a significant step toward enhancing the trustworthiness of LLM outputs, particularly in tasks requiring logical integrity.

2024

Given natural-language instructions, large language models can generate the expected outputs, demonstrating their in-context learning ability. In-context learning performance is closely tied to the quality of the in-context template, yet existing work typically constructs the template with a single selection algorithm, which fails to fully exploit the in-context learning capability. This paper proposes a template construction method for in-context learning based on dynamic clustering and label-space mapping. It dynamically selects relevant demonstrations and further introduces a cluster-filtering method to select diverse demonstrations from different semantic clusters. A loss-based ranking and selection method is designed to assess how well a template learns the correct label-space mapping distribution, and the ranked demonstrations form the final template. Experiments on natural language inference and other tasks show that the proposed method improves the accuracy of two different large language models by up to 3.2% and 8.9%, respectively.
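The two selection stages described above — cluster filtering for diversity, then loss-based ranking to order the final template — can be sketched as follows. All data here is hypothetical: the cluster ids stand in for the output of some semantic clustering, and the per-demonstration loss stands in for the paper's label-space-mapping score (lower is assumed better):

```python
# Hypothetical candidate pool: each demonstration carries a semantic
# cluster id and a precomputed label-mapping loss (both made up).
pool = [
    {"demo": "A entails B",        "cluster": 0, "loss": 0.9},
    {"demo": "C entails D",        "cluster": 0, "loss": 0.4},
    {"demo": "E contradicts F",    "cluster": 1, "loss": 0.3},
    {"demo": "G is neutral to H",  "cluster": 2, "loss": 0.7},
]

# Stage 1 (diversity): keep the lowest-loss candidate from each cluster,
# so every semantic cluster contributes at most one demonstration.
best = {}
for cand in pool:
    cid = cand["cluster"]
    if cid not in best or cand["loss"] < best[cid]["loss"]:
        best[cid] = cand

# Stage 2 (ranking): order the survivors by loss to form the template.
template = [c["demo"] for c in sorted(best.values(), key=lambda c: c["loss"])]
print(template)  # → ['E contradicts F', 'C entails D', 'G is neutral to H']
```

The point of the two stages is that ranking alone could fill the template with near-duplicates from one cluster; filtering first guarantees coverage of distinct semantic regions before quality ordering is applied.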