Shuo Chen
2024
Boosting LLM Agents with Recursive Contemplation for Effective Deception Handling
Shenzhi Wang | Chang Liu | Zilong Zheng | Siyuan Qi | Shuo Chen | Qisen Yang | Andrew Zhao | Chaofei Wang | Shiji Song | Gao Huang
Findings of the Association for Computational Linguistics: ACL 2024
Recent advances in large language models (LLMs) have led to significant success in using LLMs as agents. Nevertheless, a common assumption that LLMs always process honest information neglects the widespread deceptive or misleading content in human and AI-generated material. This oversight might expose LLMs to malicious manipulations. To enhance LLMs’ ability to identify and counteract deceptive information, in this paper, inspired by humans’ recursive thinking and perspective-taking, we introduce a novel cognitive framework, Recursive Contemplation (ReCon). ReCon combines formulation and refinement contemplation processes; formulation contemplation produces initial thoughts and speech, while refinement contemplation further polishes them. Additionally, we incorporate first-order and second-order perspective transitions into these processes respectively. Specifically, the first-order allows an LLM agent to infer others’ mental states, and the second-order involves understanding how others perceive the agent’s mental state. After integrating ReCon with various LLMs, extensive experiment results from the Avalon game and BigTom benchmark indicate ReCon’s efficacy in aiding LLMs to discern and maneuver around deceptive information without extra fine-tuning and data. Finally, we demonstrate ReCon’s scaling trend with model parameters, and explore the current limitations of LLMs in terms of safety and reasoning, potentially furnishing insights for subsequent research. Our project page can be found at https://shenzhi-wang.github.io/avalon_recon.
Visual Question Decomposition on Multimodal Large Language Models
Haowei Zhang | Jianzhe Liu | Zhen Han | Shuo Chen | Bailan He | Volker Tresp | Zhiqiang Xu | Jindong Gu
Findings of the Association for Computational Linguistics: EMNLP 2024
Question decomposition has emerged as an effective strategy for prompting Large Language Models (LLMs) to answer complex questions. However, while existing methods primarily focus on unimodal language models, the question decomposition capability of Multimodal Large Language Models (MLLMs) has yet to be explored. To this end, this paper explores visual question decomposition on MLLMs. Specifically, we introduce a systematic evaluation framework including a dataset and several evaluation criteria to assess the quality of the decomposed sub-questions, revealing that existing MLLMs struggle to produce high-quality sub-questions. To address this limitation, we propose a specific finetuning dataset, DecoVQA+, for enhancing the model’s question decomposition capability. Aiming at enabling models to perform appropriate selective decomposition, we propose an efficient finetuning pipeline. The finetuning pipeline consists of our proposed dataset and a training objective for selective decomposition. Finetuned MLLMs demonstrate significant improvements in the quality of sub-questions and the policy of selective question decomposition. Additionally, the models also achieve higher accuracy with selective decomposition on VQA benchmark datasets.