Machine Behavior in Relational Moral Dilemmas: Moral Rightness, Predicted Human Behavior, and Model Decisions

Jiseon Kim; Jea Kwon; Luiz Felipe Vecchietti; Wenchao Dong; Jaehong Kim; Meeyoung Cha

Abstract

Human moral judgment is context-dependent and changes with interpersonal relationships. As large language models (LLMs) increasingly serve as decision-support systems, it is critical to understand whether they encode these social nuances. We characterize LLM behavior using the Whistleblower’s Dilemma, systematically varying two experimental factors: crime severity and relational closeness. Our study compares three evaluative perspectives: (1) moral rightness (general prescriptive norms), (2) predicted human behavior (how models expect people to navigate social situations), and (3) the models’ own decision-making. By analyzing the models’ reasoning processes, we find a clear cross-perspective divergence: moral rightness judgments remain consistently fairness-oriented, while predicted human behavior shifts with relational context toward loyalty. Crucially, the models’ decisions mirror their moral rightness judgments rather than their behavioral predictions. This cross-perspective inconsistency suggests that LLM decision-making favors abstract rules over the social sensitivity found in the models’ internal modeling, potentially producing conflicting expectations in real-world deployments.

Anthology ID: 2026.findings-acl.1547
Volume: Findings of the Association for Computational Linguistics: ACL 2026
Month: July
Year: 2026
Address: San Diego, California, United States
Editors: Maria Liakata, Viviane P. Moreira, Jiajun Zhang, David Jurgens
Venue: Findings
Publisher: Association for Computational Linguistics
Pages: 30938–30955
URL: https://aclanthology.org/2026.findings-acl.1547/
Cite (ACL): Jiseon Kim, Jea Kwon, Luiz Felipe Vecchietti, Wenchao Dong, Jaehong Kim, and Meeyoung Cha. 2026. Machine Behavior in Relational Moral Dilemmas: Moral Rightness, Predicted Human Behavior, and Model Decisions. In Findings of the Association for Computational Linguistics: ACL 2026, pages 30938–30955, San Diego, California, United States. Association for Computational Linguistics.
Cite (Informal): Machine Behavior in Relational Moral Dilemmas: Moral Rightness, Predicted Human Behavior, and Model Decisions (Kim et al., Findings 2026)
PDF: https://aclanthology.org/2026.findings-acl.1547.pdf
Checklist: 2026.findings-acl.1547.checklist.pdf
