LexSumm and LexT5: Benchmarking and Modeling Legal Summarization Tasks in English

Santosh T.y.s.s, Cornelius Weiss, Matthias Grabmair


Abstract
In the evolving NLP landscape, benchmarks serve as yardsticks for gauging progress. However, existing Legal NLP benchmarks focus only on predictive tasks, overlooking generative tasks. This work curates LexSumm, a benchmark designed for evaluating legal summarization tasks in English. It comprises eight English legal summarization datasets from diverse jurisdictions, such as the US, UK, EU and India. Additionally, we release LexT5, a legal-oriented sequence-to-sequence model, addressing the limitations of existing BERT-style encoder-only models in the legal domain. We assess its capabilities through zero-shot probing on LegalLAMA and fine-tuning on LexSumm. Our analysis reveals abstraction and faithfulness errors even in summaries generated by zero-shot LLMs, indicating opportunities for further improvement. The LexSumm benchmark and LexT5 model are available at https://github.com/TUMLegalTech/LexSumm-LexT5.
Anthology ID:
2024.nllp-1.35
Volume:
Proceedings of the Natural Legal Language Processing Workshop 2024
Month:
November
Year:
2024
Address:
Miami, FL, USA
Editors:
Nikolaos Aletras, Ilias Chalkidis, Leslie Barrett, Cătălina Goanță, Daniel Preoțiuc-Pietro, Gerasimos Spanakis
Venue:
NLLP
Publisher:
Association for Computational Linguistics
Pages:
381–403
URL:
https://aclanthology.org/2024.nllp-1.35
Cite (ACL):
Santosh T.y.s.s, Cornelius Weiss, and Matthias Grabmair. 2024. LexSumm and LexT5: Benchmarking and Modeling Legal Summarization Tasks in English. In Proceedings of the Natural Legal Language Processing Workshop 2024, pages 381–403, Miami, FL, USA. Association for Computational Linguistics.
Cite (Informal):
LexSumm and LexT5: Benchmarking and Modeling Legal Summarization Tasks in English (T.y.s.s et al., NLLP 2024)
PDF:
https://aclanthology.org/2024.nllp-1.35.pdf