Run Peng
2024
Shall We Team Up: Exploring Spontaneous Cooperation of Competing LLM Agents
Zengqing Wu | Run Peng | Shuyuan Zheng | Qianying Liu | Xu Han | Brian Kwon | Makoto Onizuka | Shaojie Tang | Chuan Xiao
Findings of the Association for Computational Linguistics: EMNLP 2024
Large Language Models (LLMs) have increasingly been utilized in social simulations, where they are often guided by carefully crafted instructions to stably exhibit human-like behaviors during simulations. Nevertheless, we doubt the necessity of shaping agents’ behaviors for accurate social simulations. Instead, this paper emphasizes the importance of spontaneous phenomena, wherein agents deeply engage in contexts and make adaptive decisions without explicit directions. We explored spontaneous cooperation across three competitive scenarios and successfully simulated the gradual emergence of cooperation, findings that align closely with human behavioral data. This approach not only aids the computational social science community in bridging the gap between simulations and real-world dynamics but also offers the AI community a novel method to assess LLMs’ capability of deliberate reasoning. Our source code is available at https://github.com/wuzengqing001225/SABM_ShallWeTeamUp
2023
Towards A Holistic Landscape of Situated Theory of Mind in Large Language Models
Ziqiao Ma | Jacob Sansom | Run Peng | Joyce Chai
Findings of the Association for Computational Linguistics: EMNLP 2023
Large Language Models (LLMs) have generated considerable interest and debate regarding their potential emergence of Theory of Mind (ToM). Several recent inquiries reveal a lack of robust ToM in these models and pose a pressing demand to develop new benchmarks, as current ones primarily focus on different aspects of ToM and are prone to shortcuts and data leakage. In this position paper, we seek to answer two road-blocking questions: (1) How can we taxonomize a holistic landscape of machine ToM? (2) What is a more effective evaluation protocol for machine ToM? Following psychological studies, we taxonomize machine ToM into 7 mental state categories and delineate existing benchmarks to identify under-explored aspects of ToM. We argue for a holistic and situated evaluation of ToM to break ToM into individual components and treat LLMs as an agent who is physically situated in environments and socially situated in interactions with humans. Such situated evaluation provides a more comprehensive assessment of mental states and potentially mitigates the risk of shortcuts and data leakage. We further present a pilot study in a grid world setup as a proof of concept. We hope this position paper can facilitate future research to integrate ToM with LLMs and offer an intuitive means for researchers to better position their work in the landscape of ToM.