Super-SCOTUS: A multi-sourced dataset for the Supreme Court of the US

Biaoyan Fang; Trevor Cohn; Timothy Baldwin; Lea Frermann

doi:10.18653/v1/2023.nllp-1.20

Abstract

The US Supreme Court's judicial process is complex, spanning multiple procedural phases and drawing on a variety of resources. However, most research focuses on a limited set of resources, e.g., court opinions or oral arguments, to analyze a specific perspective, e.g., partisanship or voting. Gaining a fuller understanding of these perspectives requires a more comprehensive dataset that connects different sources across the phases of the court procedure. To address this gap, we present a multi-sourced dataset for the Supreme Court, comprising court resources from different procedural phases and connecting language documents with extensive metadata. We showcase its utility through a case study on how different court documents reveal the decision direction (conservative vs. liberal) of the cases. We analyze performance differences across three protected attributes; the results indicate that different court resources encode different biases, reinforcing that considering various resources provides a fuller picture of the court procedures. We further discuss how our dataset can contribute to future research directions.

Anthology ID: 2023.nllp-1.20
Volume: Proceedings of the Natural Legal Language Processing Workshop 2023
Month: December
Year: 2023
Address: Singapore
Editors: Daniel Preoțiuc-Pietro, Catalina Goanta, Ilias Chalkidis, Leslie Barrett, Gerasimos Spanakis, Nikolaos Aletras
Venues: NLLP | WS
Publisher: Association for Computational Linguistics
Pages: 202–214
URL: https://aclanthology.org/2023.nllp-1.20/
DOI: 10.18653/v1/2023.nllp-1.20
Cite (ACL): Biaoyan Fang, Trevor Cohn, Timothy Baldwin, and Lea Frermann. 2023. Super-SCOTUS: A multi-sourced dataset for the Supreme Court of the US. In Proceedings of the Natural Legal Language Processing Workshop 2023, pages 202–214, Singapore. Association for Computational Linguistics.
Cite (Informal): Super-SCOTUS: A multi-sourced dataset for the Supreme Court of the US (Fang et al., NLLP 2023)
PDF: https://aclanthology.org/2023.nllp-1.20.pdf
Video: https://aclanthology.org/2023.nllp-1.20.mp4
