SLARD: A Chinese Superior Legal Article Retrieval Dataset

Zhe Chen; Pengjie Ren; Fuhui Sun; Xiaoyan Wang; Yujun Li; Siwen Zhao; Tengyi Yang

Abstract

Retrieving superior legal articles involves identifying relevant legal articles that hold higher legal effectiveness. This process is crucial in legislative work because superior legal articles form the legal basis for drafting new laws. However, most existing legal information retrieval research focuses on retrieving legal documents, and little of it addresses superior legal articles. This gap restricts the digitization of legislative work. To advance research in this area, we propose SLARD: A Chinese Superior Legal Article Retrieval Dataset, which comprises 2,627 queries and 9,184 candidates filtered from over 4.3 million effective Chinese regulations, covering 32 categories, such as environment, agriculture, and water resources. Each query is manually annotated, and the candidates include superior articles at both the provincial and national levels. We conducted detailed experiments and analyses on the dataset and found that existing retrieval methods struggle to achieve strong results: the best method achieved an R@1 of only 0.4719. Additionally, we found that existing large language models (LLMs) lack prior knowledge of the content of superior legal articles. These findings indicate the need for further research in this field.
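For readers unfamiliar with the R@1 figure quoted above, the following is a minimal sketch of how Recall@k is commonly computed for a retrieval benchmark. It is an illustration under stated assumptions, not the authors' evaluation code: the abstract does not give the exact metric definition, and the function name `recall_at_k`, the dictionary layout, and the toy IDs are all hypothetical.

```python
from typing import Dict, List, Set


def recall_at_k(
    ranked: Dict[str, List[str]],
    relevant: Dict[str, Set[str]],
    k: int = 1,
) -> float:
    """Mean over queries of (relevant candidates in the top k) / (all relevant candidates).

    Assumption: this is the common per-query recall definition; the paper's
    exact formulation of R@k may differ.
    """
    scores = []
    for qid, gold in relevant.items():
        if not gold:
            continue  # skip queries with no annotated relevant candidate
        top_k = set(ranked.get(qid, [])[:k])
        scores.append(len(top_k & gold) / len(gold))
    return sum(scores) / len(scores) if scores else 0.0


# Toy, made-up data: two queries, each with one annotated relevant candidate.
ranked = {"q1": ["a3", "a1"], "q2": ["a7", "a2"]}
relevant = {"q1": {"a3"}, "q2": {"a2"}}

r1 = recall_at_k(ranked, relevant, k=1)  # only q1 has its relevant candidate at rank 1
r2 = recall_at_k(ranked, relevant, k=2)  # both queries are covered within the top 2
```

Under this definition, an R@1 of 0.4719 would mean that, averaged over queries, fewer than half of the annotated superior articles are returned at the top rank.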

Anthology ID: 2025.coling-main.50
Volume: Proceedings of the 31st International Conference on Computational Linguistics
Month: January
Year: 2025
Address: Abu Dhabi, UAE
Editors: Owen Rambow, Leo Wanner, Marianna Apidianaki, Hend Al-Khalifa, Barbara Di Eugenio, Steven Schockaert
Venue: COLING
Publisher: Association for Computational Linguistics
Pages: 740–754
URL: https://aclanthology.org/2025.coling-main.50/
Cite (ACL): Zhe Chen, Pengjie Ren, Fuhui Sun, Xiaoyan Wang, Yujun Li, Siwen Zhao, and Tengyi Yang. 2025. SLARD: A Chinese Superior Legal Article Retrieval Dataset. In Proceedings of the 31st International Conference on Computational Linguistics, pages 740–754, Abu Dhabi, UAE. Association for Computational Linguistics.
Cite (Informal): SLARD: A Chinese Superior Legal Article Retrieval Dataset (Chen et al., COLING 2025)
PDF: https://aclanthology.org/2025.coling-main.50.pdf
