DEFT: A corpus for definition extraction in free- and semi-structured text

Sasha Spala; Nicholas A. Miller; Yiming Yang (杨亦鸣); Franck Dernoncourt; Carl Dockhorn

doi:10.18653/v1/W19-4015

Abstract

Definition extraction has been a popular topic in NLP research for well more than a decade, but has been historically limited to well-defined, structured, and narrow conditions. In reality, natural language is messy, and messy data requires both complex solutions and data that reflects that reality. In this paper, we present a robust English corpus and annotation schema that allows us to explore the less straightforward examples of term-definition structures in free and semi-structured text.

Anthology ID: W19-4015
Volume: Proceedings of the 13th Linguistic Annotation Workshop
Month: August
Year: 2019
Address: Florence, Italy
Editors: Annemarie Friedrich, Deniz Zeyrek, Jet Hoek
Venue: LAW
SIG: SIGANN
Publisher: Association for Computational Linguistics
Pages: 124–131
URL: https://aclanthology.org/W19-4015/
DOI: 10.18653/v1/W19-4015
Bibkey: spala-etal-2019-deft
Cite (ACL): Sasha Spala, Nicholas A. Miller, Yiming Yang, Franck Dernoncourt, and Carl Dockhorn. 2019. DEFT: A corpus for definition extraction in free- and semi-structured text. In Proceedings of the 13th Linguistic Annotation Workshop, pages 124–131, Florence, Italy. Association for Computational Linguistics.
Cite (Informal): DEFT: A corpus for definition extraction in free- and semi-structured text (Spala et al., LAW 2019)
PDF: https://aclanthology.org/W19-4015.pdf

Export citation

BibTeX

@inproceedings{spala-etal-2019-deft,
    title = "{DEFT}: A corpus for definition extraction in free- and semi-structured text",
    author = "Spala, Sasha  and
      Miller, Nicholas A.  and
      Yang, Yiming  and
      Dernoncourt, Franck  and
      Dockhorn, Carl",
    editor = "Friedrich, Annemarie  and
      Zeyrek, Deniz  and
      Hoek, Jet",
    booktitle = "Proceedings of the 13th Linguistic Annotation Workshop",
    month = aug,
    year = "2019",
    address = "Florence, Italy",
    publisher = "Association for Computational Linguistics",
    url = "https://aclanthology.org/W19-4015/",
    doi = "10.18653/v1/W19-4015",
    pages = "124--131",
    abstract = "Definition extraction has been a popular topic in NLP research for well more than a decade, but has been historically limited to well-defined, structured, and narrow conditions. In reality, natural language is messy, and messy data requires both complex solutions and data that reflects that reality. In this paper, we present a robust English corpus and annotation schema that allows us to explore the less straightforward examples of term-definition structures in free and semi-structured text."
}

MODS XML

<?xml version="1.0" encoding="UTF-8"?>
<modsCollection xmlns="http://www.loc.gov/mods/v3">
<mods ID="spala-etal-2019-deft">
    <titleInfo>
        <title>DEFT: A corpus for definition extraction in free- and semi-structured text</title>
    </titleInfo>
    <name type="personal">
        <namePart type="given">Sasha</namePart>
        <namePart type="family">Spala</namePart>
        <role>
            <roleTerm authority="marcrelator" type="text">author</roleTerm>
        </role>
    </name>
    <name type="personal">
        <namePart type="given">Nicholas</namePart>
        <namePart type="given">A</namePart>
        <namePart type="family">Miller</namePart>
        <role>
            <roleTerm authority="marcrelator" type="text">author</roleTerm>
        </role>
    </name>
    <name type="personal">
        <namePart type="given">Yiming</namePart>
        <namePart type="family">Yang</namePart>
        <role>
            <roleTerm authority="marcrelator" type="text">author</roleTerm>
        </role>
    </name>
    <name type="personal">
        <namePart type="given">Franck</namePart>
        <namePart type="family">Dernoncourt</namePart>
        <role>
            <roleTerm authority="marcrelator" type="text">author</roleTerm>
        </role>
    </name>
    <name type="personal">
        <namePart type="given">Carl</namePart>
        <namePart type="family">Dockhorn</namePart>
        <role>
            <roleTerm authority="marcrelator" type="text">author</roleTerm>
        </role>
    </name>
    <originInfo>
        <dateIssued>2019-08</dateIssued>
    </originInfo>
    <typeOfResource>text</typeOfResource>
    <relatedItem type="host">
        <titleInfo>
            <title>Proceedings of the 13th Linguistic Annotation Workshop</title>
        </titleInfo>
        <name type="personal">
            <namePart type="given">Annemarie</namePart>
            <namePart type="family">Friedrich</namePart>
            <role>
                <roleTerm authority="marcrelator" type="text">editor</roleTerm>
            </role>
        </name>
        <name type="personal">
            <namePart type="given">Deniz</namePart>
            <namePart type="family">Zeyrek</namePart>
            <role>
                <roleTerm authority="marcrelator" type="text">editor</roleTerm>
            </role>
        </name>
        <name type="personal">
            <namePart type="given">Jet</namePart>
            <namePart type="family">Hoek</namePart>
            <role>
                <roleTerm authority="marcrelator" type="text">editor</roleTerm>
            </role>
        </name>
        <originInfo>
            <publisher>Association for Computational Linguistics</publisher>
            <place>
                <placeTerm type="text">Florence, Italy</placeTerm>
            </place>
        </originInfo>
        <genre authority="marcgt">conference publication</genre>
    </relatedItem>
    <abstract>Definition extraction has been a popular topic in NLP research for well more than a decade, but has been historically limited to well-defined, structured, and narrow conditions. In reality, natural language is messy, and messy data requires both complex solutions and data that reflects that reality. In this paper, we present a robust English corpus and annotation schema that allows us to explore the less straightforward examples of term-definition structures in free and semi-structured text.</abstract>
    <identifier type="citekey">spala-etal-2019-deft</identifier>
    <identifier type="doi">10.18653/v1/W19-4015</identifier>
    <location>
        <url>https://aclanthology.org/W19-4015/</url>
    </location>
    <part>
        <date>2019-08</date>
        <extent unit="page">
            <start>124</start>
            <end>131</end>
        </extent>
    </part>
</mods>
</modsCollection>

Endnote

%0 Conference Proceedings
%T DEFT: A corpus for definition extraction in free- and semi-structured text
%A Spala, Sasha
%A Miller, Nicholas A.
%A Yang, Yiming
%A Dernoncourt, Franck
%A Dockhorn, Carl
%Y Friedrich, Annemarie
%Y Zeyrek, Deniz
%Y Hoek, Jet
%S Proceedings of the 13th Linguistic Annotation Workshop
%D 2019
%8 August
%I Association for Computational Linguistics
%C Florence, Italy
%F spala-etal-2019-deft
%X Definition extraction has been a popular topic in NLP research for well more than a decade, but has been historically limited to well-defined, structured, and narrow conditions. In reality, natural language is messy, and messy data requires both complex solutions and data that reflects that reality. In this paper, we present a robust English corpus and annotation schema that allows us to explore the less straightforward examples of term-definition structures in free and semi-structured text.
%R 10.18653/v1/W19-4015
%U https://aclanthology.org/W19-4015/
%U https://doi.org/10.18653/v1/W19-4015
%P 124-131

Markdown (Informal)

[DEFT: A corpus for definition extraction in free- and semi-structured text](https://aclanthology.org/W19-4015/) (Spala et al., LAW 2019)
