XLM-E: Cross-lingual Language Model Pre-training via ELECTRA

Zewen Chi; Shaohan Huang; Li Dong; Shuming Ma; Bo Zheng; Saksham Singhal; Payal Bajaj; Xia Song; Xian-Ling Mao; He-Yan Huang (黄河燕); Furu Wei

doi:10.18653/v1/2022.acl-long.427

XLM-E: Cross-lingual Language Model Pre-training via ELECTRA

Zewen Chi, Shaohan Huang, Li Dong, Shuming Ma, Bo Zheng, Saksham Singhal, Payal Bajaj, Xia Song, Xian-Ling Mao, Heyan Huang, Furu Wei

Correct Metadata for

Use this form to create a GitHub issue with structured data describing the correction. You will need a GitHub account. Once you create that issue, the correction will be reviewed by a staff member.

⚠️ Mobile Users: Submitting this form to create a new issue will only work with github.com, not the GitHub Mobile app.

Important: The Anthology treat PDFs as authoritative. Please use this form only to correct data that is out of line with the PDF. See our corrections guidelines if you need to change the PDF.

Title Adjust the title. Retain tags such as <fixed-case>.

Authors Adjust author names and order to match the PDF.

Abstract Correct abstract if needed. Retain XML formatting tags such as <tex-math>. You may use <b>...</b> for bold, <i>...</i> for italic, and <url>...</url> for URLs.

Verification against PDF Ensure that the new title/authors match the snapshot below. (If there is no snapshot or it is too small, consult the PDF.)

Authors concatenated from the text boxes above:

ALL author names match the snapshot above—including middle initials, hyphens, and accents.

Abstract

In this paper, we introduce ELECTRA-style tasks to cross-lingual language model pre-training. Specifically, we present two pre-training tasks, namely multilingual replaced token detection, and translation replaced token detection. Besides, we pretrain the model, named as XLM-E, on both multilingual and parallel corpora. Our model outperforms the baseline models on various cross-lingual understanding tasks with much less computation cost. Moreover, analysis shows that XLM-E tends to obtain better cross-lingual transferability.

Anthology ID:: 2022.acl-long.427
Volume:: Proceedings of the 60th Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers)
Month:: May
Year:: 2022
Address:: Dublin, Ireland
Editors:: Smaranda Muresan, Preslav Nakov, Aline Villavicencio
Venue:: ACL
SIG:
Publisher:: Association for Computational Linguistics
Note:
Pages:: 6170–6182
Language:
URL:: https://aclanthology.org/2022.acl-long.427/
DOI:: 10.18653/v1/2022.acl-long.427
Bibkey:
Cite (ACL):: Zewen Chi, Shaohan Huang, Li Dong, Shuming Ma, Bo Zheng, Saksham Singhal, Payal Bajaj, Xia Song, Xian-Ling Mao, Heyan Huang, and Furu Wei. 2022. XLM-E: Cross-lingual Language Model Pre-training via ELECTRA. In Proceedings of the 60th Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers), pages 6170–6182, Dublin, Ireland. Association for Computational Linguistics.
Cite (Informal):: XLM-E: Cross-lingual Language Model Pre-training via ELECTRA (Chi et al., ACL 2022)
Copy Citation:
PDF:: https://aclanthology.org/2022.acl-long.427.pdf

PDF Cite Search Fix data

Export citation

BibTeX
MODS XML
Endnote
Preformatted

@inproceedings{chi-etal-2022-xlm,
    title = "{XLM}-{E}: Cross-lingual Language Model Pre-training via {ELECTRA}",
    author = "Chi, Zewen  and
      Huang, Shaohan  and
      Dong, Li  and
      Ma, Shuming  and
      Zheng, Bo  and
      Singhal, Saksham  and
      Bajaj, Payal  and
      Song, Xia  and
      Mao, Xian-Ling  and
      Huang, Heyan  and
      Wei, Furu",
    editor = "Muresan, Smaranda  and
      Nakov, Preslav  and
      Villavicencio, Aline",
    booktitle = "Proceedings of the 60th Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers)",
    month = may,
    year = "2022",
    address = "Dublin, Ireland",
    publisher = "Association for Computational Linguistics",
    url = "https://aclanthology.org/2022.acl-long.427/",
    doi = "10.18653/v1/2022.acl-long.427",
    pages = "6170--6182",
    abstract = "In this paper, we introduce ELECTRA-style tasks to cross-lingual language model pre-training. Specifically, we present two pre-training tasks, namely multilingual replaced token detection, and translation replaced token detection. Besides, we pretrain the model, named as XLM-E, on both multilingual and parallel corpora. Our model outperforms the baseline models on various cross-lingual understanding tasks with much less computation cost. Moreover, analysis shows that XLM-E tends to obtain better cross-lingual transferability."
}

Download as File

<?xml version="1.0" encoding="UTF-8"?>
<modsCollection xmlns="http://www.loc.gov/mods/v3">
<mods ID="chi-etal-2022-xlm">
    <titleInfo>
        <title>XLM-E: Cross-lingual Language Model Pre-training via ELECTRA</title>
    </titleInfo>
    <name type="personal">
        <namePart type="given">Zewen</namePart>
        <namePart type="family">Chi</namePart>
        <role>
            <roleTerm authority="marcrelator" type="text">author</roleTerm>
        </role>
    </name>
    <name type="personal">
        <namePart type="given">Shaohan</namePart>
        <namePart type="family">Huang</namePart>
        <role>
            <roleTerm authority="marcrelator" type="text">author</roleTerm>
        </role>
    </name>
    <name type="personal">
        <namePart type="given">Li</namePart>
        <namePart type="family">Dong</namePart>
        <role>
            <roleTerm authority="marcrelator" type="text">author</roleTerm>
        </role>
    </name>
    <name type="personal">
        <namePart type="given">Shuming</namePart>
        <namePart type="family">Ma</namePart>
        <role>
            <roleTerm authority="marcrelator" type="text">author</roleTerm>
        </role>
    </name>
    <name type="personal">
        <namePart type="given">Bo</namePart>
        <namePart type="family">Zheng</namePart>
        <role>
            <roleTerm authority="marcrelator" type="text">author</roleTerm>
        </role>
    </name>
    <name type="personal">
        <namePart type="given">Saksham</namePart>
        <namePart type="family">Singhal</namePart>
        <role>
            <roleTerm authority="marcrelator" type="text">author</roleTerm>
        </role>
    </name>
    <name type="personal">
        <namePart type="given">Payal</namePart>
        <namePart type="family">Bajaj</namePart>
        <role>
            <roleTerm authority="marcrelator" type="text">author</roleTerm>
        </role>
    </name>
    <name type="personal">
        <namePart type="given">Xia</namePart>
        <namePart type="family">Song</namePart>
        <role>
            <roleTerm authority="marcrelator" type="text">author</roleTerm>
        </role>
    </name>
    <name type="personal">
        <namePart type="given">Xian-Ling</namePart>
        <namePart type="family">Mao</namePart>
        <role>
            <roleTerm authority="marcrelator" type="text">author</roleTerm>
        </role>
    </name>
    <name type="personal">
        <namePart type="given">Heyan</namePart>
        <namePart type="family">Huang</namePart>
        <role>
            <roleTerm authority="marcrelator" type="text">author</roleTerm>
        </role>
    </name>
    <name type="personal">
        <namePart type="given">Furu</namePart>
        <namePart type="family">Wei</namePart>
        <role>
            <roleTerm authority="marcrelator" type="text">author</roleTerm>
        </role>
    </name>
    <originInfo>
        <dateIssued>2022-05</dateIssued>
    </originInfo>
    <typeOfResource>text</typeOfResource>
    <relatedItem type="host">
        <titleInfo>
            <title>Proceedings of the 60th Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers)</title>
        </titleInfo>
        <name type="personal">
            <namePart type="given">Smaranda</namePart>
            <namePart type="family">Muresan</namePart>
            <role>
                <roleTerm authority="marcrelator" type="text">editor</roleTerm>
            </role>
        </name>
        <name type="personal">
            <namePart type="given">Preslav</namePart>
            <namePart type="family">Nakov</namePart>
            <role>
                <roleTerm authority="marcrelator" type="text">editor</roleTerm>
            </role>
        </name>
        <name type="personal">
            <namePart type="given">Aline</namePart>
            <namePart type="family">Villavicencio</namePart>
            <role>
                <roleTerm authority="marcrelator" type="text">editor</roleTerm>
            </role>
        </name>
        <originInfo>
            <publisher>Association for Computational Linguistics</publisher>
            <place>
                <placeTerm type="text">Dublin, Ireland</placeTerm>
            </place>
        </originInfo>
        <genre authority="marcgt">conference publication</genre>
    </relatedItem>
    <abstract>In this paper, we introduce ELECTRA-style tasks to cross-lingual language model pre-training. Specifically, we present two pre-training tasks, namely multilingual replaced token detection, and translation replaced token detection. Besides, we pretrain the model, named as XLM-E, on both multilingual and parallel corpora. Our model outperforms the baseline models on various cross-lingual understanding tasks with much less computation cost. Moreover, analysis shows that XLM-E tends to obtain better cross-lingual transferability.</abstract>
    <identifier type="citekey">chi-etal-2022-xlm</identifier>
    <identifier type="doi">10.18653/v1/2022.acl-long.427</identifier>
    <location>
        <url>https://aclanthology.org/2022.acl-long.427/</url>
    </location>
    <part>
        <date>2022-05</date>
        <extent unit="page">
            <start>6170</start>
            <end>6182</end>
        </extent>
    </part>
</mods>
</modsCollection>

Download as File

%0 Conference Proceedings
%T XLM-E: Cross-lingual Language Model Pre-training via ELECTRA
%A Chi, Zewen
%A Huang, Shaohan
%A Dong, Li
%A Ma, Shuming
%A Zheng, Bo
%A Singhal, Saksham
%A Bajaj, Payal
%A Song, Xia
%A Mao, Xian-Ling
%A Huang, Heyan
%A Wei, Furu
%Y Muresan, Smaranda
%Y Nakov, Preslav
%Y Villavicencio, Aline
%S Proceedings of the 60th Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers)
%D 2022
%8 May
%I Association for Computational Linguistics
%C Dublin, Ireland
%F chi-etal-2022-xlm
%X In this paper, we introduce ELECTRA-style tasks to cross-lingual language model pre-training. Specifically, we present two pre-training tasks, namely multilingual replaced token detection, and translation replaced token detection. Besides, we pretrain the model, named as XLM-E, on both multilingual and parallel corpora. Our model outperforms the baseline models on various cross-lingual understanding tasks with much less computation cost. Moreover, analysis shows that XLM-E tends to obtain better cross-lingual transferability.
%R 10.18653/v1/2022.acl-long.427
%U https://aclanthology.org/2022.acl-long.427/
%U https://doi.org/10.18653/v1/2022.acl-long.427
%P 6170-6182

Download as File

Markdown (Informal)

[XLM-E: Cross-lingual Language Model Pre-training via ELECTRA](https://aclanthology.org/2022.acl-long.427/) (Chi et al., ACL 2022)

XLM-E: Cross-lingual Language Model Pre-training via ELECTRA (Chi et al., ACL 2022)

ACL

Zewen Chi, Shaohan Huang, Li Dong, Shuming Ma, Bo Zheng, Saksham Singhal, Payal Bajaj, Xia Song, Xian-Ling Mao, Heyan Huang, and Furu Wei. 2022. XLM-E: Cross-lingual Language Model Pre-training via ELECTRA. In Proceedings of the 60th Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers), pages 6170–6182, Dublin, Ireland. Association for Computational Linguistics.