Learning from Limited Labels for Long Legal Dialogue

Jenny Hong; Derek Chong; Christopher D. Manning

doi:10.18653/v1/2021.nllp-1.20

Abstract

We study high-accuracy information extraction of case factors from a challenging dataset of parole hearings, which, compared to other legal NLP datasets, has longer texts and fewer labels. On this corpus, existing work that directly applies pretrained neural models has failed to extract all but a few relatively basic items, with little improvement over rule-based extraction. We address two challenges posed by existing work: training on long documents and reasoning over complex speech patterns. Our approach resembles two-step open-domain question answering: a Reducer extracts relevant text segments, and a Producer generates both extractive answers and non-extractive classifications. In a context like ours, with limited labeled data, we show that combining a rule-based Reducer with a neural Producer is a superior approach for achieving strong performance within limited development time. We study four representative tasks from the parole dataset. On all four, we improve extraction from the previous benchmark of 0.41–0.63 to 0.83–0.89 F1.
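
The Reducer/Producer split described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the function names, the keyword pattern, the context window, and the toy transcript are all hypothetical, and the Producer here is a trivial stand-in for what the paper describes as a neural model.

```python
import re
from typing import Callable, List


def reduce_transcript(turns: List[str], pattern: str, window: int = 1) -> List[str]:
    """Hypothetical rule-based Reducer: keep every turn that matches
    `pattern`, plus `window` neighbouring turns on each side."""
    regex = re.compile(pattern, re.IGNORECASE)
    keep = set()
    for i, turn in enumerate(turns):
        if regex.search(turn):
            start = max(0, i - window)
            end = min(len(turns), i + window + 1)
            keep.update(range(start, end))
    # Preserve the original order of the dialogue.
    return [turns[i] for i in sorted(keep)]


def extract(turns: List[str], pattern: str,
            producer: Callable[[str], object], window: int = 1) -> object:
    """Two-step pipeline: reduce the long transcript to a short passage,
    then hand that passage to the Producer."""
    segments = reduce_transcript(turns, pattern, window)
    return producer(" ".join(segments))


# Toy transcript, invented purely for illustration.
toy_turns = [
    "COMMISSIONER: Good morning.",
    "COMMISSIONER: Let's talk about your education.",
    "INMATE: I earned my GED in 2015.",
    "COMMISSIONER: Any questions?",
    "COMMISSIONER: Thank you.",
]

# Stand-in Producer: a real system would call a neural model here.
has_ged = extract(toy_turns, r"\bGED\b|education", lambda text: "GED" in text)
```

The design point this sketch tries to capture is that the Reducer needs no labeled data, so the limited labels can be spent entirely on the Producer, which only ever sees a short passage rather than the full hearing.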

Anthology ID: 2021.nllp-1.20
Volume: Proceedings of the Natural Legal Language Processing Workshop 2021
Month: November
Year: 2021
Address: Punta Cana, Dominican Republic
Editors: Nikolaos Aletras, Ion Androutsopoulos, Leslie Barrett, Catalina Goanta, Daniel Preotiuc-Pietro
Venue: NLLP
Publisher: Association for Computational Linguistics
Pages: 190–204
URL: https://aclanthology.org/2021.nllp-1.20/
DOI: 10.18653/v1/2021.nllp-1.20
Cite (ACL): Jenny Hong, Derek Chong, and Christopher Manning. 2021. Learning from Limited Labels for Long Legal Dialogue. In Proceedings of the Natural Legal Language Processing Workshop 2021, pages 190–204, Punta Cana, Dominican Republic. Association for Computational Linguistics.
Cite (Informal): Learning from Limited Labels for Long Legal Dialogue (Hong et al., NLLP 2021)
PDF: https://aclanthology.org/2021.nllp-1.20.pdf
