Extreme Multi-Label Legal Text Classification: A Case Study in EU Legislation

Ilias Chalkidis; Emmanouil Fergadiotis; Prodromos Malakasiotis; Nikolaos Aletras; Ion Androutsopoulos

doi:10.18653/v1/W19-2209

Abstract

We consider the task of Extreme Multi-Label Text Classification (XMTC) in the legal domain. We release a new dataset of 57k legislative documents from EURLEX, the European Union’s public document database, annotated with concepts from EUROVOC, a multidisciplinary thesaurus. The dataset is substantially larger than previous EURLEX datasets and suitable for XMTC, few-shot and zero-shot learning. Experimenting with several neural classifiers, we show that BIGRUs with self-attention outperform the current multi-label state-of-the-art methods, which employ label-wise attention. Replacing CNNs with BIGRUs in label-wise attention networks leads to the best overall performance.

Anthology ID: W19-2209
Volume: Proceedings of the Natural Legal Language Processing Workshop 2019
Month: June
Year: 2019
Address: Minneapolis, Minnesota
Editors: Nikolaos Aletras, Elliott Ash, Leslie Barrett, Daniel Chen, Adam Meyers, Daniel Preotiuc-Pietro, David Rosenberg, Amanda Stent
Venue: NAACL
Publisher: Association for Computational Linguistics
Pages: 78–87
URL: https://aclanthology.org/W19-2209/
DOI: 10.18653/v1/W19-2209
Bibkey: chalkidis-etal-2019-extreme
Cite (ACL): Ilias Chalkidis, Emmanouil Fergadiotis, Prodromos Malakasiotis, Nikolaos Aletras, and Ion Androutsopoulos. 2019. Extreme Multi-Label Legal Text Classification: A Case Study in EU Legislation. In Proceedings of the Natural Legal Language Processing Workshop 2019, pages 78–87, Minneapolis, Minnesota. Association for Computational Linguistics.
Cite (Informal): Extreme Multi-Label Legal Text Classification: A Case Study in EU Legislation (Chalkidis et al., NAACL 2019)
PDF: https://aclanthology.org/W19-2209.pdf

Export citation

BibTeX

@inproceedings{chalkidis-etal-2019-extreme,
    title = "Extreme Multi-Label Legal Text Classification: A Case Study in {EU} Legislation",
    author = "Chalkidis, Ilias  and
      Fergadiotis, Emmanouil  and
      Malakasiotis, Prodromos  and
      Aletras, Nikolaos  and
      Androutsopoulos, Ion",
    editor = "Aletras, Nikolaos  and
      Ash, Elliott  and
      Barrett, Leslie  and
      Chen, Daniel  and
      Meyers, Adam  and
      Preotiuc-Pietro, Daniel  and
      Rosenberg, David  and
      Stent, Amanda",
    booktitle = "Proceedings of the Natural Legal Language Processing Workshop 2019",
    month = jun,
    year = "2019",
    address = "Minneapolis, Minnesota",
    publisher = "Association for Computational Linguistics",
    url = "https://aclanthology.org/W19-2209/",
    doi = "10.18653/v1/W19-2209",
    pages = "78--87",
    abstract = "We consider the task of Extreme Multi-Label Text Classification (XMTC) in the legal domain. We release a new dataset of 57k legislative documents from EURLEX, the European Union{'}s public document database, annotated with concepts from EUROVOC, a multidisciplinary thesaurus. The dataset is substantially larger than previous EURLEX datasets and suitable for XMTC, few-shot and zero-shot learning. Experimenting with several neural classifiers, we show that BIGRUs with self-attention outperform the current multi-label state-of-the-art methods, which employ label-wise attention. Replacing CNNs with BIGRUs in label-wise attention networks leads to the best overall performance."
}

MODS XML

<?xml version="1.0" encoding="UTF-8"?>
<modsCollection xmlns="http://www.loc.gov/mods/v3">
<mods ID="chalkidis-etal-2019-extreme">
    <titleInfo>
        <title>Extreme Multi-Label Legal Text Classification: A Case Study in EU Legislation</title>
    </titleInfo>
    <name type="personal">
        <namePart type="given">Ilias</namePart>
        <namePart type="family">Chalkidis</namePart>
        <role>
            <roleTerm authority="marcrelator" type="text">author</roleTerm>
        </role>
    </name>
    <name type="personal">
        <namePart type="given">Emmanouil</namePart>
        <namePart type="family">Fergadiotis</namePart>
        <role>
            <roleTerm authority="marcrelator" type="text">author</roleTerm>
        </role>
    </name>
    <name type="personal">
        <namePart type="given">Prodromos</namePart>
        <namePart type="family">Malakasiotis</namePart>
        <role>
            <roleTerm authority="marcrelator" type="text">author</roleTerm>
        </role>
    </name>
    <name type="personal">
        <namePart type="given">Nikolaos</namePart>
        <namePart type="family">Aletras</namePart>
        <role>
            <roleTerm authority="marcrelator" type="text">author</roleTerm>
        </role>
    </name>
    <name type="personal">
        <namePart type="given">Ion</namePart>
        <namePart type="family">Androutsopoulos</namePart>
        <role>
            <roleTerm authority="marcrelator" type="text">author</roleTerm>
        </role>
    </name>
    <originInfo>
        <dateIssued>2019-06</dateIssued>
    </originInfo>
    <typeOfResource>text</typeOfResource>
    <relatedItem type="host">
        <titleInfo>
            <title>Proceedings of the Natural Legal Language Processing Workshop 2019</title>
        </titleInfo>
        <name type="personal">
            <namePart type="given">Nikolaos</namePart>
            <namePart type="family">Aletras</namePart>
            <role>
                <roleTerm authority="marcrelator" type="text">editor</roleTerm>
            </role>
        </name>
        <name type="personal">
            <namePart type="given">Elliott</namePart>
            <namePart type="family">Ash</namePart>
            <role>
                <roleTerm authority="marcrelator" type="text">editor</roleTerm>
            </role>
        </name>
        <name type="personal">
            <namePart type="given">Leslie</namePart>
            <namePart type="family">Barrett</namePart>
            <role>
                <roleTerm authority="marcrelator" type="text">editor</roleTerm>
            </role>
        </name>
        <name type="personal">
            <namePart type="given">Daniel</namePart>
            <namePart type="family">Chen</namePart>
            <role>
                <roleTerm authority="marcrelator" type="text">editor</roleTerm>
            </role>
        </name>
        <name type="personal">
            <namePart type="given">Adam</namePart>
            <namePart type="family">Meyers</namePart>
            <role>
                <roleTerm authority="marcrelator" type="text">editor</roleTerm>
            </role>
        </name>
        <name type="personal">
            <namePart type="given">Daniel</namePart>
            <namePart type="family">Preotiuc-Pietro</namePart>
            <role>
                <roleTerm authority="marcrelator" type="text">editor</roleTerm>
            </role>
        </name>
        <name type="personal">
            <namePart type="given">David</namePart>
            <namePart type="family">Rosenberg</namePart>
            <role>
                <roleTerm authority="marcrelator" type="text">editor</roleTerm>
            </role>
        </name>
        <name type="personal">
            <namePart type="given">Amanda</namePart>
            <namePart type="family">Stent</namePart>
            <role>
                <roleTerm authority="marcrelator" type="text">editor</roleTerm>
            </role>
        </name>
        <originInfo>
            <publisher>Association for Computational Linguistics</publisher>
            <place>
                <placeTerm type="text">Minneapolis, Minnesota</placeTerm>
            </place>
        </originInfo>
        <genre authority="marcgt">conference publication</genre>
    </relatedItem>
    <abstract>We consider the task of Extreme Multi-Label Text Classification (XMTC) in the legal domain. We release a new dataset of 57k legislative documents from EURLEX, the European Union’s public document database, annotated with concepts from EUROVOC, a multidisciplinary thesaurus. The dataset is substantially larger than previous EURLEX datasets and suitable for XMTC, few-shot and zero-shot learning. Experimenting with several neural classifiers, we show that BIGRUs with self-attention outperform the current multi-label state-of-the-art methods, which employ label-wise attention. Replacing CNNs with BIGRUs in label-wise attention networks leads to the best overall performance.</abstract>
    <identifier type="citekey">chalkidis-etal-2019-extreme</identifier>
    <identifier type="doi">10.18653/v1/W19-2209</identifier>
    <location>
        <url>https://aclanthology.org/W19-2209/</url>
    </location>
    <part>
        <date>2019-06</date>
        <extent unit="page">
            <start>78</start>
            <end>87</end>
        </extent>
    </part>
</mods>
</modsCollection>

Endnote

%0 Conference Proceedings
%T Extreme Multi-Label Legal Text Classification: A Case Study in EU Legislation
%A Chalkidis, Ilias
%A Fergadiotis, Emmanouil
%A Malakasiotis, Prodromos
%A Aletras, Nikolaos
%A Androutsopoulos, Ion
%Y Aletras, Nikolaos
%Y Ash, Elliott
%Y Barrett, Leslie
%Y Chen, Daniel
%Y Meyers, Adam
%Y Preotiuc-Pietro, Daniel
%Y Rosenberg, David
%Y Stent, Amanda
%S Proceedings of the Natural Legal Language Processing Workshop 2019
%D 2019
%8 June
%I Association for Computational Linguistics
%C Minneapolis, Minnesota
%F chalkidis-etal-2019-extreme
%X We consider the task of Extreme Multi-Label Text Classification (XMTC) in the legal domain. We release a new dataset of 57k legislative documents from EURLEX, the European Union’s public document database, annotated with concepts from EUROVOC, a multidisciplinary thesaurus. The dataset is substantially larger than previous EURLEX datasets and suitable for XMTC, few-shot and zero-shot learning. Experimenting with several neural classifiers, we show that BIGRUs with self-attention outperform the current multi-label state-of-the-art methods, which employ label-wise attention. Replacing CNNs with BIGRUs in label-wise attention networks leads to the best overall performance.
%R 10.18653/v1/W19-2209
%U https://aclanthology.org/W19-2209/
%U https://doi.org/10.18653/v1/W19-2209
%P 78-87

Markdown (Informal)

[Extreme Multi-Label Legal Text Classification: A Case Study in EU Legislation](https://aclanthology.org/W19-2209/) (Chalkidis et al., NAACL 2019)

ACL

Ilias Chalkidis, Emmanouil Fergadiotis, Prodromos Malakasiotis, Nikolaos Aletras, and Ion Androutsopoulos. 2019. Extreme Multi-Label Legal Text Classification: A Case Study in EU Legislation. In Proceedings of the Natural Legal Language Processing Workshop 2019, pages 78–87, Minneapolis, Minnesota. Association for Computational Linguistics.