AGRR 2019: Corpus for Gapping Resolution in Russian

Maria Ponomareva; Kira Droganova; Ivan Smurov; Tatiana Shavrina

doi:10.18653/v1/W19-3705

Abstract

This paper provides a comprehensive overview of the gapping dataset for Russian that consists of 7.5k sentences with gapping (as well as 15k relevant negative sentences) and comprises data from various genres: news, fiction, social media and technical texts. The dataset was prepared for the Automatic Gapping Resolution Shared Task for Russian (AGRR-2019) - a competition aimed at stimulating the development of NLP tools and methods for processing of ellipsis. In this paper, we pay special attention to the gapping resolution methods that were introduced within the shared task as well as an alternative test set that illustrates that our corpus is a diverse and representative subset of Russian language gapping sufficient for effective utilization of machine learning techniques.

Anthology ID: W19-3705
Volume: Proceedings of the 7th Workshop on Balto-Slavic Natural Language Processing
Month: August
Year: 2019
Address: Florence, Italy
Editors: Tomaž Erjavec, Michał Marcińczuk, Preslav Nakov, Jakub Piskorski, Lidia Pivovarova, Jan Šnajder, Josef Steinberger, Roman Yangarber
Venue: BSNLP
SIG: SIGSLAV
Publisher: Association for Computational Linguistics
Pages: 35–43
URL: https://aclanthology.org/W19-3705/
DOI: 10.18653/v1/W19-3705
Bibkey: ponomareva-etal-2019-agrr
Cite (ACL): Maria Ponomareva, Kira Droganova, Ivan Smurov, and Tatiana Shavrina. 2019. AGRR 2019: Corpus for Gapping Resolution in Russian. In Proceedings of the 7th Workshop on Balto-Slavic Natural Language Processing, pages 35–43, Florence, Italy. Association for Computational Linguistics.
Cite (Informal): AGRR 2019: Corpus for Gapping Resolution in Russian (Ponomareva et al., BSNLP 2019)
PDF: https://aclanthology.org/W19-3705.pdf

Export citation

BibTeX

@inproceedings{ponomareva-etal-2019-agrr,
    title = "{AGRR} 2019: Corpus for Gapping Resolution in {R}ussian",
    author = "Ponomareva, Maria  and
      Droganova, Kira  and
      Smurov, Ivan  and
      Shavrina, Tatiana",
    editor = "Erjavec, Toma{\v{z}}  and
      Marci{\'n}czuk, Micha{\l}  and
      Nakov, Preslav  and
      Piskorski, Jakub  and
      Pivovarova, Lidia  and
      {\v{S}}najder, Jan  and
      Steinberger, Josef  and
      Yangarber, Roman",
    booktitle = "Proceedings of the 7th Workshop on Balto-Slavic Natural Language Processing",
    month = aug,
    year = "2019",
    address = "Florence, Italy",
    publisher = "Association for Computational Linguistics",
    url = "https://aclanthology.org/W19-3705/",
    doi = "10.18653/v1/W19-3705",
    pages = "35--43",
    abstract = "This paper provides a comprehensive overview of the gapping dataset for Russian that consists of 7.5k sentences with gapping (as well as 15k relevant negative sentences) and comprises data from various genres: news, fiction, social media and technical texts. The dataset was prepared for the Automatic Gapping Resolution Shared Task for Russian (AGRR-2019) - a competition aimed at stimulating the development of NLP tools and methods for processing of ellipsis. In this paper, we pay special attention to the gapping resolution methods that were introduced within the shared task as well as an alternative test set that illustrates that our corpus is a diverse and representative subset of Russian language gapping sufficient for effective utilization of machine learning techniques."
}

MODS XML

<?xml version="1.0" encoding="UTF-8"?>
<modsCollection xmlns="http://www.loc.gov/mods/v3">
<mods ID="ponomareva-etal-2019-agrr">
    <titleInfo>
        <title>AGRR 2019: Corpus for Gapping Resolution in Russian</title>
    </titleInfo>
    <name type="personal">
        <namePart type="given">Maria</namePart>
        <namePart type="family">Ponomareva</namePart>
        <role>
            <roleTerm authority="marcrelator" type="text">author</roleTerm>
        </role>
    </name>
    <name type="personal">
        <namePart type="given">Kira</namePart>
        <namePart type="family">Droganova</namePart>
        <role>
            <roleTerm authority="marcrelator" type="text">author</roleTerm>
        </role>
    </name>
    <name type="personal">
        <namePart type="given">Ivan</namePart>
        <namePart type="family">Smurov</namePart>
        <role>
            <roleTerm authority="marcrelator" type="text">author</roleTerm>
        </role>
    </name>
    <name type="personal">
        <namePart type="given">Tatiana</namePart>
        <namePart type="family">Shavrina</namePart>
        <role>
            <roleTerm authority="marcrelator" type="text">author</roleTerm>
        </role>
    </name>
    <originInfo>
        <dateIssued>2019-08</dateIssued>
    </originInfo>
    <typeOfResource>text</typeOfResource>
    <relatedItem type="host">
        <titleInfo>
            <title>Proceedings of the 7th Workshop on Balto-Slavic Natural Language Processing</title>
        </titleInfo>
        <name type="personal">
            <namePart type="given">Tomaž</namePart>
            <namePart type="family">Erjavec</namePart>
            <role>
                <roleTerm authority="marcrelator" type="text">editor</roleTerm>
            </role>
        </name>
        <name type="personal">
            <namePart type="given">Michał</namePart>
            <namePart type="family">Marcińczuk</namePart>
            <role>
                <roleTerm authority="marcrelator" type="text">editor</roleTerm>
            </role>
        </name>
        <name type="personal">
            <namePart type="given">Preslav</namePart>
            <namePart type="family">Nakov</namePart>
            <role>
                <roleTerm authority="marcrelator" type="text">editor</roleTerm>
            </role>
        </name>
        <name type="personal">
            <namePart type="given">Jakub</namePart>
            <namePart type="family">Piskorski</namePart>
            <role>
                <roleTerm authority="marcrelator" type="text">editor</roleTerm>
            </role>
        </name>
        <name type="personal">
            <namePart type="given">Lidia</namePart>
            <namePart type="family">Pivovarova</namePart>
            <role>
                <roleTerm authority="marcrelator" type="text">editor</roleTerm>
            </role>
        </name>
        <name type="personal">
            <namePart type="given">Jan</namePart>
            <namePart type="family">Šnajder</namePart>
            <role>
                <roleTerm authority="marcrelator" type="text">editor</roleTerm>
            </role>
        </name>
        <name type="personal">
            <namePart type="given">Josef</namePart>
            <namePart type="family">Steinberger</namePart>
            <role>
                <roleTerm authority="marcrelator" type="text">editor</roleTerm>
            </role>
        </name>
        <name type="personal">
            <namePart type="given">Roman</namePart>
            <namePart type="family">Yangarber</namePart>
            <role>
                <roleTerm authority="marcrelator" type="text">editor</roleTerm>
            </role>
        </name>
        <originInfo>
            <publisher>Association for Computational Linguistics</publisher>
            <place>
                <placeTerm type="text">Florence, Italy</placeTerm>
            </place>
        </originInfo>
        <genre authority="marcgt">conference publication</genre>
    </relatedItem>
    <abstract>This paper provides a comprehensive overview of the gapping dataset for Russian that consists of 7.5k sentences with gapping (as well as 15k relevant negative sentences) and comprises data from various genres: news, fiction, social media and technical texts. The dataset was prepared for the Automatic Gapping Resolution Shared Task for Russian (AGRR-2019) - a competition aimed at stimulating the development of NLP tools and methods for processing of ellipsis. In this paper, we pay special attention to the gapping resolution methods that were introduced within the shared task as well as an alternative test set that illustrates that our corpus is a diverse and representative subset of Russian language gapping sufficient for effective utilization of machine learning techniques.</abstract>
    <identifier type="citekey">ponomareva-etal-2019-agrr</identifier>
    <identifier type="doi">10.18653/v1/W19-3705</identifier>
    <location>
        <url>https://aclanthology.org/W19-3705/</url>
    </location>
    <part>
        <date>2019-08</date>
        <extent unit="page">
            <start>35</start>
            <end>43</end>
        </extent>
    </part>
</mods>
</modsCollection>

Endnote

%0 Conference Proceedings
%T AGRR 2019: Corpus for Gapping Resolution in Russian
%A Ponomareva, Maria
%A Droganova, Kira
%A Smurov, Ivan
%A Shavrina, Tatiana
%Y Erjavec, Tomaž
%Y Marcińczuk, Michał
%Y Nakov, Preslav
%Y Piskorski, Jakub
%Y Pivovarova, Lidia
%Y Šnajder, Jan
%Y Steinberger, Josef
%Y Yangarber, Roman
%S Proceedings of the 7th Workshop on Balto-Slavic Natural Language Processing
%D 2019
%8 August
%I Association for Computational Linguistics
%C Florence, Italy
%F ponomareva-etal-2019-agrr
%X This paper provides a comprehensive overview of the gapping dataset for Russian that consists of 7.5k sentences with gapping (as well as 15k relevant negative sentences) and comprises data from various genres: news, fiction, social media and technical texts. The dataset was prepared for the Automatic Gapping Resolution Shared Task for Russian (AGRR-2019) - a competition aimed at stimulating the development of NLP tools and methods for processing of ellipsis. In this paper, we pay special attention to the gapping resolution methods that were introduced within the shared task as well as an alternative test set that illustrates that our corpus is a diverse and representative subset of Russian language gapping sufficient for effective utilization of machine learning techniques.
%R 10.18653/v1/W19-3705
%U https://aclanthology.org/W19-3705/
%U https://doi.org/10.18653/v1/W19-3705
%P 35-43

Markdown (Informal)

[AGRR 2019: Corpus for Gapping Resolution in Russian](https://aclanthology.org/W19-3705/) (Ponomareva et al., BSNLP 2019)