CMDL: A Large-Scale Chinese Multi-Defendant Legal Judgment Prediction Dataset

Wanhong Huang, Yi Feng, Chuanyi Li, Honghan Wu, Jidong Ge, Vincent Ng


Abstract
Legal Judgment Prediction (LJP) has attracted significant attention in recent years. However, previous studies have primarily focused on cases involving a single defendant, skipping multi-defendant cases because of their complexity and difficulty. To advance research on this setting, we introduce CMDL, a large-scale real-world Chinese Multi-Defendant LJP dataset, which consists of over 393,945 cases with nearly 1.2 million defendants in total. For performance evaluation, we propose case-level evaluation metrics designed for the multi-defendant scenario. Experimental results on CMDL show that existing SOTA approaches perform poorly when applied to cases involving multiple defendants. We highlight several challenges that require attention and resolution.
Anthology ID:
2024.findings-acl.351
Volume:
Findings of the Association for Computational Linguistics ACL 2024
Month:
August
Year:
2024
Address:
Bangkok, Thailand and virtual meeting
Editors:
Lun-Wei Ku, Andre Martins, Vivek Srikumar
Venue:
Findings
Publisher:
Association for Computational Linguistics
Pages:
5895–5906
URL:
https://aclanthology.org/2024.findings-acl.351
Cite (ACL):
Wanhong Huang, Yi Feng, Chuanyi Li, Honghan Wu, Jidong Ge, and Vincent Ng. 2024. CMDL: A Large-Scale Chinese Multi-Defendant Legal Judgment Prediction Dataset. In Findings of the Association for Computational Linguistics ACL 2024, pages 5895–5906, Bangkok, Thailand and virtual meeting. Association for Computational Linguistics.
Cite (Informal):
CMDL: A Large-Scale Chinese Multi-Defendant Legal Judgment Prediction Dataset (Huang et al., Findings 2024)
PDF:
https://aclanthology.org/2024.findings-acl.351.pdf