Probing Multilingual Sentence Representations With X-Probe

Vinit Ravishankar; Lilja Øvrelid; Erik Velldal

doi:10.18653/v1/W19-4318

Abstract

This paper extends the task of probing sentence representations for linguistic insight in a multilingual domain. In doing so, we make two contributions: first, we provide datasets for multilingual probing, derived from Wikipedia, in five languages, viz. English, French, German, Spanish and Russian. Second, we evaluate six sentence encoders for each language, each trained by mapping sentence representations to English sentence representations, using sentences in a parallel corpus. We discover that cross-lingually mapped representations are often better at retaining certain linguistic information than representations derived from English encoders trained on natural language inference (NLI) as a downstream task.
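As a rough illustration of what "mapping sentence representations to English sentence representations" can mean, the sketch below fits a linear least-squares map between two embedding spaces on synthetic paired vectors. This is a toy stand-in and not the method used in the paper, which trains full sentence encoders. The function name `fit_linear_map`, the dimensions, and the random data are all assumptions made for illustration.

```python
import numpy as np

def fit_linear_map(src, tgt):
    """Return W minimizing ||src @ W - tgt||^2 (ordinary least squares)."""
    W, *_ = np.linalg.lstsq(src, tgt, rcond=None)
    return W

# Synthetic "parallel corpus": each row pairs a source-language sentence
# vector with the vector of its English translation (hypothetical data).
rng = np.random.default_rng(0)
n_pairs, src_dim, en_dim = 200, 8, 6
true_map = rng.normal(size=(src_dim, en_dim))
src_vecs = rng.normal(size=(n_pairs, src_dim))
en_vecs = src_vecs @ true_map + 0.01 * rng.normal(size=(n_pairs, en_dim))

# Fit the map on the paired vectors and measure how closely the mapped
# source vectors land on their English counterparts.
W = fit_linear_map(src_vecs, en_vecs)
mse = float(np.mean((src_vecs @ W - en_vecs) ** 2))
print(f"mean squared error after mapping: {mse:.6f}")
```

In a probing setup, the mapped vectors would then be frozen and fed to a simple classifier that predicts a linguistic property of the sentence; that step is omitted here.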

Anthology ID: W19-4318
Volume: Proceedings of the 4th Workshop on Representation Learning for NLP (RepL4NLP-2019)
Month: August
Year: 2019
Address: Florence, Italy
Editors: Isabelle Augenstein, Spandana Gella, Sebastian Ruder, Katharina Kann, Burcu Can, Johannes Welbl, Alexis Conneau, Xiang Ren, Marek Rei
Venue: RepL4NLP
SIG: SIGREP
Publisher: Association for Computational Linguistics
Pages: 156–168
URL: https://aclanthology.org/W19-4318/
DOI: 10.18653/v1/W19-4318
Bibkey: ravishankar-etal-2019-probing
Cite (ACL): Vinit Ravishankar, Lilja Øvrelid, and Erik Velldal. 2019. Probing Multilingual Sentence Representations With X-Probe. In Proceedings of the 4th Workshop on Representation Learning for NLP (RepL4NLP-2019), pages 156–168, Florence, Italy. Association for Computational Linguistics.
Cite (Informal): Probing Multilingual Sentence Representations With X-Probe (Ravishankar et al., RepL4NLP 2019)
PDF: https://aclanthology.org/W19-4318.pdf

BibTeX

@inproceedings{ravishankar-etal-2019-probing,
    title = "Probing Multilingual Sentence Representations With {X}-Probe",
    author = "Ravishankar, Vinit  and
      {\O}vrelid, Lilja  and
      Velldal, Erik",
    editor = "Augenstein, Isabelle  and
      Gella, Spandana  and
      Ruder, Sebastian  and
      Kann, Katharina  and
      Can, Burcu  and
      Welbl, Johannes  and
      Conneau, Alexis  and
      Ren, Xiang  and
      Rei, Marek",
    booktitle = "Proceedings of the 4th Workshop on Representation Learning for NLP (RepL4NLP-2019)",
    month = aug,
    year = "2019",
    address = "Florence, Italy",
    publisher = "Association for Computational Linguistics",
    url = "https://aclanthology.org/W19-4318/",
    doi = "10.18653/v1/W19-4318",
    pages = "156--168",
    abstract = "This paper extends the task of probing sentence representations for linguistic insight in a multilingual domain. In doing so, we make two contributions: first, we provide datasets for multilingual probing, derived from Wikipedia, in five languages, viz. English, French, German, Spanish and Russian. Second, we evaluate six sentence encoders for each language, each trained by mapping sentence representations to English sentence representations, using sentences in a parallel corpus. We discover that cross-lingually mapped representations are often better at retaining certain linguistic information than representations derived from English encoders trained on natural language inference (NLI) as a downstream task."
}

MODS XML

<?xml version="1.0" encoding="UTF-8"?>
<modsCollection xmlns="http://www.loc.gov/mods/v3">
<mods ID="ravishankar-etal-2019-probing">
    <titleInfo>
        <title>Probing Multilingual Sentence Representations With X-Probe</title>
    </titleInfo>
    <name type="personal">
        <namePart type="given">Vinit</namePart>
        <namePart type="family">Ravishankar</namePart>
        <role>
            <roleTerm authority="marcrelator" type="text">author</roleTerm>
        </role>
    </name>
    <name type="personal">
        <namePart type="given">Lilja</namePart>
        <namePart type="family">Øvrelid</namePart>
        <role>
            <roleTerm authority="marcrelator" type="text">author</roleTerm>
        </role>
    </name>
    <name type="personal">
        <namePart type="given">Erik</namePart>
        <namePart type="family">Velldal</namePart>
        <role>
            <roleTerm authority="marcrelator" type="text">author</roleTerm>
        </role>
    </name>
    <originInfo>
        <dateIssued>2019-08</dateIssued>
    </originInfo>
    <typeOfResource>text</typeOfResource>
    <relatedItem type="host">
        <titleInfo>
            <title>Proceedings of the 4th Workshop on Representation Learning for NLP (RepL4NLP-2019)</title>
        </titleInfo>
        <name type="personal">
            <namePart type="given">Isabelle</namePart>
            <namePart type="family">Augenstein</namePart>
            <role>
                <roleTerm authority="marcrelator" type="text">editor</roleTerm>
            </role>
        </name>
        <name type="personal">
            <namePart type="given">Spandana</namePart>
            <namePart type="family">Gella</namePart>
            <role>
                <roleTerm authority="marcrelator" type="text">editor</roleTerm>
            </role>
        </name>
        <name type="personal">
            <namePart type="given">Sebastian</namePart>
            <namePart type="family">Ruder</namePart>
            <role>
                <roleTerm authority="marcrelator" type="text">editor</roleTerm>
            </role>
        </name>
        <name type="personal">
            <namePart type="given">Katharina</namePart>
            <namePart type="family">Kann</namePart>
            <role>
                <roleTerm authority="marcrelator" type="text">editor</roleTerm>
            </role>
        </name>
        <name type="personal">
            <namePart type="given">Burcu</namePart>
            <namePart type="family">Can</namePart>
            <role>
                <roleTerm authority="marcrelator" type="text">editor</roleTerm>
            </role>
        </name>
        <name type="personal">
            <namePart type="given">Johannes</namePart>
            <namePart type="family">Welbl</namePart>
            <role>
                <roleTerm authority="marcrelator" type="text">editor</roleTerm>
            </role>
        </name>
        <name type="personal">
            <namePart type="given">Alexis</namePart>
            <namePart type="family">Conneau</namePart>
            <role>
                <roleTerm authority="marcrelator" type="text">editor</roleTerm>
            </role>
        </name>
        <name type="personal">
            <namePart type="given">Xiang</namePart>
            <namePart type="family">Ren</namePart>
            <role>
                <roleTerm authority="marcrelator" type="text">editor</roleTerm>
            </role>
        </name>
        <name type="personal">
            <namePart type="given">Marek</namePart>
            <namePart type="family">Rei</namePart>
            <role>
                <roleTerm authority="marcrelator" type="text">editor</roleTerm>
            </role>
        </name>
        <originInfo>
            <publisher>Association for Computational Linguistics</publisher>
            <place>
                <placeTerm type="text">Florence, Italy</placeTerm>
            </place>
        </originInfo>
        <genre authority="marcgt">conference publication</genre>
    </relatedItem>
    <abstract>This paper extends the task of probing sentence representations for linguistic insight in a multilingual domain. In doing so, we make two contributions: first, we provide datasets for multilingual probing, derived from Wikipedia, in five languages, viz. English, French, German, Spanish and Russian. Second, we evaluate six sentence encoders for each language, each trained by mapping sentence representations to English sentence representations, using sentences in a parallel corpus. We discover that cross-lingually mapped representations are often better at retaining certain linguistic information than representations derived from English encoders trained on natural language inference (NLI) as a downstream task.</abstract>
    <identifier type="citekey">ravishankar-etal-2019-probing</identifier>
    <identifier type="doi">10.18653/v1/W19-4318</identifier>
    <location>
        <url>https://aclanthology.org/W19-4318/</url>
    </location>
    <part>
        <date>2019-08</date>
        <extent unit="page">
            <start>156</start>
            <end>168</end>
        </extent>
    </part>
</mods>
</modsCollection>

Endnote

%0 Conference Proceedings
%T Probing Multilingual Sentence Representations With X-Probe
%A Ravishankar, Vinit
%A Øvrelid, Lilja
%A Velldal, Erik
%Y Augenstein, Isabelle
%Y Gella, Spandana
%Y Ruder, Sebastian
%Y Kann, Katharina
%Y Can, Burcu
%Y Welbl, Johannes
%Y Conneau, Alexis
%Y Ren, Xiang
%Y Rei, Marek
%S Proceedings of the 4th Workshop on Representation Learning for NLP (RepL4NLP-2019)
%D 2019
%8 August
%I Association for Computational Linguistics
%C Florence, Italy
%F ravishankar-etal-2019-probing
%X This paper extends the task of probing sentence representations for linguistic insight in a multilingual domain. In doing so, we make two contributions: first, we provide datasets for multilingual probing, derived from Wikipedia, in five languages, viz. English, French, German, Spanish and Russian. Second, we evaluate six sentence encoders for each language, each trained by mapping sentence representations to English sentence representations, using sentences in a parallel corpus. We discover that cross-lingually mapped representations are often better at retaining certain linguistic information than representations derived from English encoders trained on natural language inference (NLI) as a downstream task.
%R 10.18653/v1/W19-4318
%U https://aclanthology.org/W19-4318/
%U https://doi.org/10.18653/v1/W19-4318
%P 156-168
