Rethinking Machine Ethics – Can LLMs Perform Moral Reasoning through the Lens of Moral Theories?

Jingyan Zhou, Minda Hu, Junan Li, Xiaoying Zhang, Xixin Wu, Irwin King, Helen Meng


Abstract
Making moral judgments is an essential step toward developing ethical AI systems. Prevalent approaches are mostly bottom-up: they use a large set of annotated data to train models on crowd-sourced opinions about morality. These approaches have been criticized for potentially overgeneralizing a limited group of annotators’ moral stances and for lacking explainability. This work proposes a flexible top-down framework that steers (Large) Language Models to perform moral reasoning with well-established moral theories from interdisciplinary research. The theory-guided top-down framework can incorporate various moral theories. Our experiments demonstrate the effectiveness of the proposed framework on datasets derived from moral theories. Furthermore, we show the alignment between different moral theories and existing morality datasets. Our analysis reveals both the potential and the flaws of existing resources (models and datasets) for developing explainable moral judgment-making systems.
Anthology ID:
2024.findings-naacl.144
Volume:
Findings of the Association for Computational Linguistics: NAACL 2024
Month:
June
Year:
2024
Address:
Mexico City, Mexico
Editors:
Kevin Duh, Helena Gomez, Steven Bethard
Venue:
Findings
Publisher:
Association for Computational Linguistics
Pages:
2227–2242
URL:
https://aclanthology.org/2024.findings-naacl.144
Cite (ACL):
Jingyan Zhou, Minda Hu, Junan Li, Xiaoying Zhang, Xixin Wu, Irwin King, and Helen Meng. 2024. Rethinking Machine Ethics – Can LLMs Perform Moral Reasoning through the Lens of Moral Theories? In Findings of the Association for Computational Linguistics: NAACL 2024, pages 2227–2242, Mexico City, Mexico. Association for Computational Linguistics.
Cite (Informal):
Rethinking Machine Ethics – Can LLMs Perform Moral Reasoning through the Lens of Moral Theories? (Zhou et al., Findings 2024)
PDF:
https://aclanthology.org/2024.findings-naacl.144.pdf
Copyright:
 2024.findings-naacl.144.copyright.pdf