Effectively Leveraging BERT for Legal Document Classification

Nut Limsopatham


Abstract
Bidirectional Encoder Representations from Transformers (BERT) has achieved state-of-the-art performance on several text classification tasks, such as GLUE and sentiment analysis. Recent work in the legal domain has started to use BERT on tasks such as legal judgement prediction and violation prediction. A common practice when using BERT is to fine-tune a pre-trained model on a target task and truncate the input texts to the size of the BERT input (e.g. at most 512 tokens). However, due to the unique characteristics of legal documents, it is not clear how to effectively adapt BERT in the legal domain. In this work, we investigate how to deal with long documents, and how important it is to pre-train on documents from the same domain as the target task. We conduct experiments on two recent datasets: the ECHR Violation Dataset and the Overruling Task Dataset, which are multi-label and binary classification tasks, respectively. Importantly, documents in the ECHR Violation Dataset contain more than 1,600 tokens on average, while documents in the Overruling Task Dataset are shorter (at most 204 tokens). We thoroughly compare several techniques for adapting BERT to long documents, and compare different models pre-trained on the legal and other domains. Our experimental results show that BERT must be explicitly adapted to handle long documents, as truncation leads to less effective performance. We also found that pre-training on documents similar to those of the target task yields more effective performance in several scenarios.
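To make the truncation issue the abstract raises concrete, below is a minimal Python sketch (using the Hugging Face transformers library; this is not the paper's own code) contrasting the common truncation baseline with one simple long-document strategy: splitting a document into chunks, encoding each chunk with BERT, and mean-pooling the per-chunk [CLS] vectors. The model name, chunk length, and pooling choice are illustrative assumptions, not the configuration used in the paper.

import torch
from transformers import AutoTokenizer, AutoModel

# Illustrative choice of checkpoint; the paper also compares
# models pre-trained on legal-domain corpora.
tokenizer = AutoTokenizer.from_pretrained("bert-base-uncased")
bert = AutoModel.from_pretrained("bert-base-uncased")

def truncated_embedding(text: str) -> torch.Tensor:
    """Common practice: keep only the first 512 tokens of the document."""
    enc = tokenizer(text, truncation=True, max_length=512, return_tensors="pt")
    with torch.no_grad():
        out = bert(**enc)
    return out.last_hidden_state[:, 0]  # [CLS] representation

def chunked_embedding(text: str, chunk_len: int = 510) -> torch.Tensor:
    """One way to avoid truncation: encode the document in 510-token
    chunks (leaving room for [CLS]/[SEP]) and mean-pool the per-chunk
    [CLS] vectors into a single document-level representation."""
    ids = tokenizer(text, add_special_tokens=False)["input_ids"]
    cls_id, sep_id = tokenizer.cls_token_id, tokenizer.sep_token_id
    chunk_vecs = []
    for i in range(0, len(ids), chunk_len):
        chunk = [cls_id] + ids[i:i + chunk_len] + [sep_id]
        with torch.no_grad():
            out = bert(input_ids=torch.tensor([chunk]))
        chunk_vecs.append(out.last_hidden_state[:, 0])
    return torch.stack(chunk_vecs).mean(dim=0)  # document-level vector

A classification head over either document vector can then be fine-tuned as usual; the abstract's finding is that approaches which see the whole document outperform the truncation baseline on long-document tasks such as ECHR violation prediction.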
Anthology ID:
2021.nllp-1.22
Volume:
Proceedings of the Natural Legal Language Processing Workshop 2021
Month:
November
Year:
2021
Address:
Punta Cana, Dominican Republic
Editors:
Nikolaos Aletras, Ion Androutsopoulos, Leslie Barrett, Catalina Goanta, Daniel Preotiuc-Pietro
Venue:
NLLP
Publisher:
Association for Computational Linguistics
Pages:
210–216
URL:
https://aclanthology.org/2021.nllp-1.22
DOI:
10.18653/v1/2021.nllp-1.22
Cite (ACL):
Nut Limsopatham. 2021. Effectively Leveraging BERT for Legal Document Classification. In Proceedings of the Natural Legal Language Processing Workshop 2021, pages 210–216, Punta Cana, Dominican Republic. Association for Computational Linguistics.
Cite (Informal):
Effectively Leveraging BERT for Legal Document Classification (Limsopatham, NLLP 2021)
PDF:
https://aclanthology.org/2021.nllp-1.22.pdf
Data:
ECHR, GLUE, Overruling