@inproceedings{nangia-etal-2026-unsc,
title = "{UNSC}-Bench: Evaluating {LLM} Diplomatic Role-Playing Through {UN} Security Council Vote Prediction",
author = "Nangia, Ayush and
Gokrani, Aman and
Lazzaroni, Ruggero Marino",
editor = "Chen, Pinzhen and
Zouhar, Vil{\'e}m and
Hu, Hanxu and
Khanuja, Simran and
Zhu, Wenhao and
Haddow, Barry and
Birch, Alexandra and
Aji, Alham Fikri and
Sennrich, Rico and
Hooker, Sara",
booktitle = "Proceedings of the First Workshop on Multilingual Multicultural Evaluation",
month = mar,
year = "2026",
address = "Rabat, Morocco",
publisher = "Association for Computational Linguistics",
url = "https://aclanthology.org/2026.mme-main.10/",
pages = "162--176",
ISBN = "979-8-89176-368-5",
abstract = "This paper introduces UNSC-Bench, a benchmark for evaluating Large Language Models (LLMs) in simulating diplomatic decision-making through United Nations Security Council (UNSC) vote prediction. The dataset includes 469 UNSC resolutions from 1947 to 2025, with voting records for the five permanent members (P5) (United States, China, France, Russia, United Kingdom) and translations in four languages. We analyze 26 LLMs, along with thinking variants, across multiple P5 roles and find that (1) without explicit role assignment, models are diplomatically unaligned, defaulting to high yes rates and failing to match any P5 voting pattern, indicating they lack inherent diplomatic identity; (2) model capability (as measured by MMLU-Pro) is strongly correlated with role-playing accuracy; (3) regional models do not outperform others in predicting their home country{'}s votes; and (4) multilingual evaluation reveals that prompt language impacts model predictions, particularly for minority vote outcomes."
}
<?xml version="1.0" encoding="UTF-8"?>
<modsCollection xmlns="http://www.loc.gov/mods/v3">
<mods ID="nangia-etal-2026-unsc">
<titleInfo>
<title>UNSC-Bench: Evaluating LLM Diplomatic Role-Playing Through UN Security Council Vote Prediction</title>
</titleInfo>
<name type="personal">
<namePart type="given">Ayush</namePart>
<namePart type="family">Nangia</namePart>
<role>
<roleTerm authority="marcrelator" type="text">author</roleTerm>
</role>
</name>
<name type="personal">
<namePart type="given">Aman</namePart>
<namePart type="family">Gokrani</namePart>
<role>
<roleTerm authority="marcrelator" type="text">author</roleTerm>
</role>
</name>
<name type="personal">
<namePart type="given">Ruggero</namePart>
<namePart type="given">Marino</namePart>
<namePart type="family">Lazzaroni</namePart>
<role>
<roleTerm authority="marcrelator" type="text">author</roleTerm>
</role>
</name>
<originInfo>
<dateIssued>2026-03</dateIssued>
</originInfo>
<typeOfResource>text</typeOfResource>
<relatedItem type="host">
<titleInfo>
<title>Proceedings of the First Workshop on Multilingual Multicultural Evaluation</title>
</titleInfo>
<name type="personal">
<namePart type="given">Pinzhen</namePart>
<namePart type="family">Chen</namePart>
<role>
<roleTerm authority="marcrelator" type="text">editor</roleTerm>
</role>
</name>
<name type="personal">
<namePart type="given">Vilém</namePart>
<namePart type="family">Zouhar</namePart>
<role>
<roleTerm authority="marcrelator" type="text">editor</roleTerm>
</role>
</name>
<name type="personal">
<namePart type="given">Hanxu</namePart>
<namePart type="family">Hu</namePart>
<role>
<roleTerm authority="marcrelator" type="text">editor</roleTerm>
</role>
</name>
<name type="personal">
<namePart type="given">Simran</namePart>
<namePart type="family">Khanuja</namePart>
<role>
<roleTerm authority="marcrelator" type="text">editor</roleTerm>
</role>
</name>
<name type="personal">
<namePart type="given">Wenhao</namePart>
<namePart type="family">Zhu</namePart>
<role>
<roleTerm authority="marcrelator" type="text">editor</roleTerm>
</role>
</name>
<name type="personal">
<namePart type="given">Barry</namePart>
<namePart type="family">Haddow</namePart>
<role>
<roleTerm authority="marcrelator" type="text">editor</roleTerm>
</role>
</name>
<name type="personal">
<namePart type="given">Alexandra</namePart>
<namePart type="family">Birch</namePart>
<role>
<roleTerm authority="marcrelator" type="text">editor</roleTerm>
</role>
</name>
<name type="personal">
<namePart type="given">Alham</namePart>
<namePart type="given">Fikri</namePart>
<namePart type="family">Aji</namePart>
<role>
<roleTerm authority="marcrelator" type="text">editor</roleTerm>
</role>
</name>
<name type="personal">
<namePart type="given">Rico</namePart>
<namePart type="family">Sennrich</namePart>
<role>
<roleTerm authority="marcrelator" type="text">editor</roleTerm>
</role>
</name>
<name type="personal">
<namePart type="given">Sara</namePart>
<namePart type="family">Hooker</namePart>
<role>
<roleTerm authority="marcrelator" type="text">editor</roleTerm>
</role>
</name>
<originInfo>
<publisher>Association for Computational Linguistics</publisher>
<place>
<placeTerm type="text">Rabat, Morocco</placeTerm>
</place>
</originInfo>
<genre authority="marcgt">conference publication</genre>
<identifier type="isbn">979-8-89176-368-5</identifier>
</relatedItem>
<abstract>This paper introduces UNSC-Bench, a benchmark for evaluating Large Language Models (LLMs) in simulating diplomatic decision-making through United Nations Security Council (UNSC) vote prediction. The dataset includes 469 UNSC resolutions from 1947 to 2025, with voting records for the five permanent members (P5) (United States, China, France, Russia, United Kingdom) and translations in four languages. We analyze 26 LLMs, along with thinking variants, across multiple P5 roles and find that (1) without explicit role assignment, models are diplomatically unaligned, defaulting to high yes rates and failing to match any P5 voting pattern, indicating they lack inherent diplomatic identity; (2) model capability (as measured by MMLU-Pro) is strongly correlated with role-playing accuracy; (3) regional models do not outperform others in predicting their home country’s votes; and (4) multilingual evaluation reveals that prompt language impacts model predictions, particularly for minority vote outcomes.</abstract>
<identifier type="citekey">nangia-etal-2026-unsc</identifier>
<location>
<url>https://aclanthology.org/2026.mme-main.10/</url>
</location>
<part>
<date>2026-03</date>
<extent unit="page">
<start>162</start>
<end>176</end>
</extent>
</part>
</mods>
</modsCollection>
%0 Conference Proceedings
%T UNSC-Bench: Evaluating LLM Diplomatic Role-Playing Through UN Security Council Vote Prediction
%A Nangia, Ayush
%A Gokrani, Aman
%A Lazzaroni, Ruggero Marino
%Y Chen, Pinzhen
%Y Zouhar, Vilém
%Y Hu, Hanxu
%Y Khanuja, Simran
%Y Zhu, Wenhao
%Y Haddow, Barry
%Y Birch, Alexandra
%Y Aji, Alham Fikri
%Y Sennrich, Rico
%Y Hooker, Sara
%S Proceedings of the First Workshop on Multilingual Multicultural Evaluation
%D 2026
%8 March
%I Association for Computational Linguistics
%C Rabat, Morocco
%@ 979-8-89176-368-5
%F nangia-etal-2026-unsc
%X This paper introduces UNSC-Bench, a benchmark for evaluating Large Language Models (LLMs) in simulating diplomatic decision-making through United Nations Security Council (UNSC) vote prediction. The dataset includes 469 UNSC resolutions from 1947 to 2025, with voting records for the five permanent members (P5) (United States, China, France, Russia, United Kingdom) and translations in four languages. We analyze 26 LLMs, along with thinking variants, across multiple P5 roles and find that (1) without explicit role assignment, models are diplomatically unaligned, defaulting to high yes rates and failing to match any P5 voting pattern, indicating they lack inherent diplomatic identity; (2) model capability (as measured by MMLU-Pro) is strongly correlated with role-playing accuracy; (3) regional models do not outperform others in predicting their home country’s votes; and (4) multilingual evaluation reveals that prompt language impacts model predictions, particularly for minority vote outcomes.
%U https://aclanthology.org/2026.mme-main.10/
%P 162-176
Markdown (Informal)
[UNSC-Bench: Evaluating LLM Diplomatic Role-Playing Through UN Security Council Vote Prediction](https://aclanthology.org/2026.mme-main.10/) (Nangia et al., MME 2026)
ACL