Wen Yao
2025
Leveraging Dual Process Theory in Language Agent Framework for Real-time Simultaneous Human-AI Collaboration
Shao Zhang | Xihuai Wang | Wenhao Zhang | Chaoran Li | Junru Song | Tingyu Li | Lin Qiu | Xuezhi Cao | Xunliang Cai | Wen Yao | Weinan Zhang | Xinbing Wang | Ying Wen
Proceedings of the 63rd Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers)
Agents built on large language models (LLMs) have excelled in turn-by-turn human-AI collaboration but struggle with simultaneous tasks requiring real-time interaction. Latency issues and the challenge of inferring variable human strategies hinder their ability to make autonomous decisions without explicit instructions. Through experiments with current independent *System 1* and *System 2* methods, we validate the necessity of using Dual Process Theory (DPT) in real-time tasks. We propose DPT-Agent, a novel language agent framework that integrates *System 1* and *System 2* for efficient real-time simultaneous human-AI collaboration. DPT-Agent’s *System 1* uses a Finite-state Machine (FSM) and code-as-policy for fast, intuitive, and controllable decision-making. DPT-Agent’s *System 2* integrates Theory of Mind (ToM) and asynchronous reflection to infer human intentions and perform reasoning-based autonomous decisions. We demonstrate the effectiveness of DPT-Agent through further experiments with rule-based agents and human collaborators, showing significant improvements over mainstream LLM-based frameworks. To the best of our knowledge, DPT-Agent is the first language agent framework that achieves successful real-time simultaneous human-AI collaboration autonomously. Code of DPT-Agent can be found at https://github.com/sjtu-marl/DPT-Agent.
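The sketch below illustrates the dual-process structure the abstract describes: a fast *System 1* driven by an FSM with an editable rule table (code-as-policy), and an asynchronous *System 2* that infers the partner's intent and rewrites those rules through reflection. It is a minimal illustration under assumed interfaces; all class, method, and event names are hypothetical and are not taken from the DPT-Agent repository.

```python
# Illustrative sketch of a dual-process agent loop (names are hypothetical,
# not taken from the DPT-Agent codebase).
import threading
import queue

class System1:
    """Fast path: a finite-state machine whose transitions map (state, event)
    pairs to actions through an editable rule table (code-as-policy)."""
    def __init__(self):
        self.state = "idle"
        self.rules = {("idle", "order_received"): ("cooking", "start_cook")}

    def act(self, event):
        next_state, action = self.rules.get((self.state, event), (self.state, "wait"))
        self.state = next_state
        return action

class System2(threading.Thread):
    """Slow path: runs asynchronously, infers the human partner's intent from
    recent history and rewrites System 1's rules (reflection)."""
    def __init__(self, system1, history):
        super().__init__(daemon=True)
        self.system1, self.history = system1, history

    def run(self):
        while True:
            events = self.history.get()        # blocks until new history arrives
            self.system1.rules.update(self.reflect(events))

    def reflect(self, events):
        # Placeholder for LLM-based Theory-of-Mind inference and reflection.
        return {}

# Real-time loop: System 1 responds every tick; System 2 updates its rules in the background.
history = queue.Queue()
s1 = System1()
System2(s1, history).start()
for event in ["order_received", "ingredient_ready"]:
    print(s1.act(event))
    history.put(event)
```

The key design point conveyed by the abstract is that the slow, reasoning-heavy path never blocks the fast path: *System 1* keeps acting at every tick while *System 2* revises its policy in the background.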
SafeConf: A Confidence-Calibrated Safety Self-Evaluation Method for Large Language Models
Bo Zhang | Cong Gao | Linkang Yang | Bingxu Han | Minghao Hu | Zhunchen Luo | Guotong Geng | Xiaoying Bai | Jun Zhang | Wen Yao | Zhong Wang
Findings of the Association for Computational Linguistics: EMNLP 2025
Large language models (LLMs) have achieved groundbreaking progress in Natural Language Processing (NLP). Despite the numerous advantages of LLMs, they also pose significant safety risks. Self-evaluation mechanisms have gained increasing attention as a key safeguard to ensure safe and controllable content generation. However, LLMs often exhibit overconfidence, which seriously compromises the accuracy of safety self-evaluation. To address this challenge, we propose SafeConf, a method to enhance the safety self-evaluation capability of LLMs through confidence calibration. The method performs semantic mutations on the original safety evaluation questions and adopts a self-consistency strategy to quantify confidence based on answer accuracy on the mutated questions. Finally, these confidence scores are used to construct a dataset for fine-tuning. We conduct experiments on both Chinese and English datasets. The results show that SafeConf improves self-evaluation accuracy by an average of 5.86% and 7.79% over the state-of-the-art baseline methods on Qwen2.5-7B-Instruct and Llama3-8B-Instruct models, respectively, without affecting the general capabilities of the models.
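A minimal sketch of the calibration idea the abstract outlines: confidence for a safety question is estimated from how consistently the model answers semantically mutated variants of it, and those scores are attached to examples to build a fine-tuning dataset. The function names, the equality check used for consistency, and the callback interfaces are assumptions for illustration, not the paper's exact procedure.

```python
# Hypothetical sketch of confidence calibration via mutated questions and
# self-consistency; names and interfaces are illustrative assumptions.
from typing import Callable, List, Sequence, Tuple, Dict

def confidence_from_mutations(
    question: str,
    reference_answer: str,
    mutate: Callable[[str, int], List[str]],   # produces semantic variants of the question
    answer: Callable[[str], str],              # queries the model under evaluation
    n_mutations: int = 5,
) -> float:
    """Estimate confidence as the fraction of mutated questions the model
    still answers consistently with the reference answer."""
    variants = mutate(question, n_mutations)
    correct = sum(answer(v).strip() == reference_answer.strip() for v in variants)
    return correct / max(len(variants), 1)

def build_calibration_dataset(
    samples: Sequence[Tuple[str, str]],
    mutate: Callable[[str, int], List[str]],
    answer: Callable[[str], str],
) -> List[Dict[str, object]]:
    """Attach a confidence score to each (question, answer) pair so the
    scores can supervise fine-tuning of the safety self-evaluator."""
    return [
        {
            "question": q,
            "answer": a,
            "confidence": confidence_from_mutations(q, a, mutate, answer),
        }
        for q, a in samples
    ]
```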