Adam L. Meyers
2021
Legal Terminology Extraction with the Termolator
Nhi Pham
|
Lachlan Pham
|
Adam L. Meyers
Proceedings of the Natural Legal Language Processing Workshop 2021
Domain-specific terminology is ubiquitous in legal documents. Despite potential utility in populating glossaries and ontologies or as arguments in information extraction and document classification tasks, there has been limited work done for legal terminology extraction. This paper describes some work to remedy this omission. In the described research, we make some modifications to the Termolator, a high-performing, open-source terminology extractor which has been tuned to scientific articles. Our changes are designed to improve the Termolator’s results when applied to United States Supreme Court decisions. Unaltered and using the recommended settings, the original Termolator provides a list of terminology with a precision of 23% and 25% for the categories of economic activity (development set) and criminal procedures (test set) respectively. These were the most frequently occurring broad issues in Washington University in St. Louis Database corpus, a database of Supreme Court decisions that have been manually classified by topic. Our contribution includes the introduction of several legal domain-specific filtration steps and changes to the web search relevance score; each incrementally improved precision culminating in a combined precision of 63% and 65%. We also evaluated the baseline version of the Termolator on more specific subcategories and on broad issues with fewer cases. Our results show that a narrowed scope as well as smaller document numbers significantly lower the precision. In both cases, the modifications to the Termolator improve precision.