@inproceedings{pham-etal-2021-legal,
title = "Legal Terminology Extraction with the Termolator",
author = "Pham, Nhi and
Pham, Lachlan and
Meyers, Adam L.",
editor = "Aletras, Nikolaos and
Androutsopoulos, Ion and
Barrett, Leslie and
Goanta, Catalina and
Preotiuc-Pietro, Daniel",
booktitle = "Proceedings of the Natural Legal Language Processing Workshop 2021",
month = nov,
year = "2021",
address = "Punta Cana, Dominican Republic",
publisher = "Association for Computational Linguistics",
url = "https://aclanthology.org/2021.nllp-1.16",
doi = "10.18653/v1/2021.nllp-1.16",
pages = "155--162",
abstract = "Domain-specific terminology is ubiquitous in legal documents. Despite potential utility in populating glossaries and ontologies or as arguments in information extraction and document classification tasks, there has been limited work done for legal terminology extraction. This paper describes some work to remedy this omission. In the described research, we make some modifications to the Termolator, a high-performing, open-source terminology extractor which has been tuned to scientific articles. Our changes are designed to improve the Termolator{'}s results when applied to United States Supreme Court decisions. Unaltered and using the recommended settings, the original Termolator provides a list of terminology with a precision of 23{\%} and 25{\%} for the categories of economic activity (development set) and criminal procedures (test set) respectively. These were the most frequently occurring broad issues in Washington University in St. Louis Database corpus, a database of Supreme Court decisions that have been manually classified by topic. Our contribution includes the introduction of several legal domain-specific filtration steps and changes to the web search relevance score; each incrementally improved precision culminating in a combined precision of 63{\%} and 65{\%}. We also evaluated the baseline version of the Termolator on more specific subcategories and on broad issues with fewer cases. Our results show that a narrowed scope as well as smaller document numbers significantly lower the precision. In both cases, the modifications to the Termolator improve precision.",
}
<?xml version="1.0" encoding="UTF-8"?>
<modsCollection xmlns="http://www.loc.gov/mods/v3">
<mods ID="pham-etal-2021-legal">
<titleInfo>
<title>Legal Terminology Extraction with the Termolator</title>
</titleInfo>
<name type="personal">
<namePart type="given">Nhi</namePart>
<namePart type="family">Pham</namePart>
<role>
<roleTerm authority="marcrelator" type="text">author</roleTerm>
</role>
</name>
<name type="personal">
<namePart type="given">Lachlan</namePart>
<namePart type="family">Pham</namePart>
<role>
<roleTerm authority="marcrelator" type="text">author</roleTerm>
</role>
</name>
<name type="personal">
<namePart type="given">Adam</namePart>
<namePart type="given">L</namePart>
<namePart type="family">Meyers</namePart>
<role>
<roleTerm authority="marcrelator" type="text">author</roleTerm>
</role>
</name>
<originInfo>
<dateIssued>2021-11</dateIssued>
</originInfo>
<typeOfResource>text</typeOfResource>
<relatedItem type="host">
<titleInfo>
<title>Proceedings of the Natural Legal Language Processing Workshop 2021</title>
</titleInfo>
<name type="personal">
<namePart type="given">Nikolaos</namePart>
<namePart type="family">Aletras</namePart>
<role>
<roleTerm authority="marcrelator" type="text">editor</roleTerm>
</role>
</name>
<name type="personal">
<namePart type="given">Ion</namePart>
<namePart type="family">Androutsopoulos</namePart>
<role>
<roleTerm authority="marcrelator" type="text">editor</roleTerm>
</role>
</name>
<name type="personal">
<namePart type="given">Leslie</namePart>
<namePart type="family">Barrett</namePart>
<role>
<roleTerm authority="marcrelator" type="text">editor</roleTerm>
</role>
</name>
<name type="personal">
<namePart type="given">Catalina</namePart>
<namePart type="family">Goanta</namePart>
<role>
<roleTerm authority="marcrelator" type="text">editor</roleTerm>
</role>
</name>
<name type="personal">
<namePart type="given">Daniel</namePart>
<namePart type="family">Preotiuc-Pietro</namePart>
<role>
<roleTerm authority="marcrelator" type="text">editor</roleTerm>
</role>
</name>
<originInfo>
<publisher>Association for Computational Linguistics</publisher>
<place>
<placeTerm type="text">Punta Cana, Dominican Republic</placeTerm>
</place>
</originInfo>
<genre authority="marcgt">conference publication</genre>
</relatedItem>
<abstract>Domain-specific terminology is ubiquitous in legal documents. Despite potential utility in populating glossaries and ontologies or as arguments in information extraction and document classification tasks, there has been limited work done for legal terminology extraction. This paper describes some work to remedy this omission. In the described research, we make some modifications to the Termolator, a high-performing, open-source terminology extractor which has been tuned to scientific articles. Our changes are designed to improve the Termolator’s results when applied to United States Supreme Court decisions. Unaltered and using the recommended settings, the original Termolator provides a list of terminology with a precision of 23% and 25% for the categories of economic activity (development set) and criminal procedures (test set) respectively. These were the most frequently occurring broad issues in Washington University in St. Louis Database corpus, a database of Supreme Court decisions that have been manually classified by topic. Our contribution includes the introduction of several legal domain-specific filtration steps and changes to the web search relevance score; each incrementally improved precision culminating in a combined precision of 63% and 65%. We also evaluated the baseline version of the Termolator on more specific subcategories and on broad issues with fewer cases. Our results show that a narrowed scope as well as smaller document numbers significantly lower the precision. In both cases, the modifications to the Termolator improve precision.</abstract>
<identifier type="citekey">pham-etal-2021-legal</identifier>
<identifier type="doi">10.18653/v1/2021.nllp-1.16</identifier>
<location>
<url>https://aclanthology.org/2021.nllp-1.16</url>
</location>
<part>
<date>2021-11</date>
<extent unit="page">
<start>155</start>
<end>162</end>
</extent>
</part>
</mods>
</modsCollection>
%0 Conference Proceedings
%T Legal Terminology Extraction with the Termolator
%A Pham, Nhi
%A Pham, Lachlan
%A Meyers, Adam L.
%Y Aletras, Nikolaos
%Y Androutsopoulos, Ion
%Y Barrett, Leslie
%Y Goanta, Catalina
%Y Preotiuc-Pietro, Daniel
%S Proceedings of the Natural Legal Language Processing Workshop 2021
%D 2021
%8 November
%I Association for Computational Linguistics
%C Punta Cana, Dominican Republic
%F pham-etal-2021-legal
%X Domain-specific terminology is ubiquitous in legal documents. Despite potential utility in populating glossaries and ontologies or as arguments in information extraction and document classification tasks, there has been limited work done for legal terminology extraction. This paper describes some work to remedy this omission. In the described research, we make some modifications to the Termolator, a high-performing, open-source terminology extractor which has been tuned to scientific articles. Our changes are designed to improve the Termolator’s results when applied to United States Supreme Court decisions. Unaltered and using the recommended settings, the original Termolator provides a list of terminology with a precision of 23% and 25% for the categories of economic activity (development set) and criminal procedures (test set) respectively. These were the most frequently occurring broad issues in Washington University in St. Louis Database corpus, a database of Supreme Court decisions that have been manually classified by topic. Our contribution includes the introduction of several legal domain-specific filtration steps and changes to the web search relevance score; each incrementally improved precision culminating in a combined precision of 63% and 65%. We also evaluated the baseline version of the Termolator on more specific subcategories and on broad issues with fewer cases. Our results show that a narrowed scope as well as smaller document numbers significantly lower the precision. In both cases, the modifications to the Termolator improve precision.
%R 10.18653/v1/2021.nllp-1.16
%U https://aclanthology.org/2021.nllp-1.16
%U https://doi.org/10.18653/v1/2021.nllp-1.16
%P 155-162
Markdown (Informal)
[Legal Terminology Extraction with the Termolator](https://aclanthology.org/2021.nllp-1.16) (Pham et al., NLLP 2021)
ACL
- Nhi Pham, Lachlan Pham, and Adam L. Meyers. 2021. Legal Terminology Extraction with the Termolator. In Proceedings of the Natural Legal Language Processing Workshop 2021, pages 155–162, Punta Cana, Dominican Republic. Association for Computational Linguistics.