@inproceedings{koto-etal-2021-discourse,
    title = "Discourse Probing of Pretrained Language Models",
    author = "Koto, Fajri and
      Lau, Jey Han and
      Baldwin, Timothy",
    editor = "Toutanova, Kristina and
      Rumshisky, Anna and
      Zettlemoyer, Luke and
      Hakkani-Tur, Dilek and
      Beltagy, Iz and
      Bethard, Steven and
      Cotterell, Ryan and
      Chakraborty, Tanmoy and
      Zhou, Yichao",
    booktitle = "Proceedings of the 2021 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies",
    month = jun,
    year = "2021",
    address = "Online",
    publisher = "Association for Computational Linguistics",
    url = "https://aclanthology.org/2021.naacl-main.301",
    doi = "10.18653/v1/2021.naacl-main.301",
    pages = "3849--3864",
    abstract = "Existing work on probing of pretrained language models (LMs) has predominantly focused on sentence-level syntactic tasks. In this paper, we introduce document-level discourse probing to evaluate the ability of pretrained LMs to capture document-level relations. We experiment with 7 pretrained LMs, 4 languages, and 7 discourse probing tasks, and find BART to be overall the best model at capturing discourse {---} but only in its encoder, with BERT performing surprisingly well as the baseline model. Across the different models, there are substantial differences in which layers best capture discourse information, and large disparities between models.",
}
<?xml version="1.0" encoding="UTF-8"?>
<modsCollection xmlns="http://www.loc.gov/mods/v3">
  <mods ID="koto-etal-2021-discourse">
    <titleInfo>
      <title>Discourse Probing of Pretrained Language Models</title>
    </titleInfo>
    <name type="personal">
      <namePart type="given">Fajri</namePart>
      <namePart type="family">Koto</namePart>
      <role>
        <roleTerm authority="marcrelator" type="text">author</roleTerm>
      </role>
    </name>
    <name type="personal">
      <namePart type="given">Jey</namePart>
      <namePart type="given">Han</namePart>
      <namePart type="family">Lau</namePart>
      <role>
        <roleTerm authority="marcrelator" type="text">author</roleTerm>
      </role>
    </name>
    <name type="personal">
      <namePart type="given">Timothy</namePart>
      <namePart type="family">Baldwin</namePart>
      <role>
        <roleTerm authority="marcrelator" type="text">author</roleTerm>
      </role>
    </name>
    <originInfo>
      <dateIssued>2021-06</dateIssued>
    </originInfo>
    <typeOfResource>text</typeOfResource>
    <relatedItem type="host">
      <titleInfo>
        <title>Proceedings of the 2021 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies</title>
      </titleInfo>
      <name type="personal">
        <namePart type="given">Kristina</namePart>
        <namePart type="family">Toutanova</namePart>
        <role>
          <roleTerm authority="marcrelator" type="text">editor</roleTerm>
        </role>
      </name>
      <name type="personal">
        <namePart type="given">Anna</namePart>
        <namePart type="family">Rumshisky</namePart>
        <role>
          <roleTerm authority="marcrelator" type="text">editor</roleTerm>
        </role>
      </name>
      <name type="personal">
        <namePart type="given">Luke</namePart>
        <namePart type="family">Zettlemoyer</namePart>
        <role>
          <roleTerm authority="marcrelator" type="text">editor</roleTerm>
        </role>
      </name>
      <name type="personal">
        <namePart type="given">Dilek</namePart>
        <namePart type="family">Hakkani-Tur</namePart>
        <role>
          <roleTerm authority="marcrelator" type="text">editor</roleTerm>
        </role>
      </name>
      <name type="personal">
        <namePart type="given">Iz</namePart>
        <namePart type="family">Beltagy</namePart>
        <role>
          <roleTerm authority="marcrelator" type="text">editor</roleTerm>
        </role>
      </name>
      <name type="personal">
        <namePart type="given">Steven</namePart>
        <namePart type="family">Bethard</namePart>
        <role>
          <roleTerm authority="marcrelator" type="text">editor</roleTerm>
        </role>
      </name>
      <name type="personal">
        <namePart type="given">Ryan</namePart>
        <namePart type="family">Cotterell</namePart>
        <role>
          <roleTerm authority="marcrelator" type="text">editor</roleTerm>
        </role>
      </name>
      <name type="personal">
        <namePart type="given">Tanmoy</namePart>
        <namePart type="family">Chakraborty</namePart>
        <role>
          <roleTerm authority="marcrelator" type="text">editor</roleTerm>
        </role>
      </name>
      <name type="personal">
        <namePart type="given">Yichao</namePart>
        <namePart type="family">Zhou</namePart>
        <role>
          <roleTerm authority="marcrelator" type="text">editor</roleTerm>
        </role>
      </name>
      <originInfo>
        <publisher>Association for Computational Linguistics</publisher>
        <place>
          <placeTerm type="text">Online</placeTerm>
        </place>
      </originInfo>
      <genre authority="marcgt">conference publication</genre>
    </relatedItem>
    <abstract>Existing work on probing of pretrained language models (LMs) has predominantly focused on sentence-level syntactic tasks. In this paper, we introduce document-level discourse probing to evaluate the ability of pretrained LMs to capture document-level relations. We experiment with 7 pretrained LMs, 4 languages, and 7 discourse probing tasks, and find BART to be overall the best model at capturing discourse — but only in its encoder, with BERT performing surprisingly well as the baseline model. Across the different models, there are substantial differences in which layers best capture discourse information, and large disparities between models.</abstract>
    <identifier type="citekey">koto-etal-2021-discourse</identifier>
    <identifier type="doi">10.18653/v1/2021.naacl-main.301</identifier>
    <location>
      <url>https://aclanthology.org/2021.naacl-main.301</url>
    </location>
    <part>
      <date>2021-06</date>
      <extent unit="page">
        <start>3849</start>
        <end>3864</end>
      </extent>
    </part>
  </mods>
</modsCollection>
%0 Conference Proceedings
%T Discourse Probing of Pretrained Language Models
%A Koto, Fajri
%A Lau, Jey Han
%A Baldwin, Timothy
%Y Toutanova, Kristina
%Y Rumshisky, Anna
%Y Zettlemoyer, Luke
%Y Hakkani-Tur, Dilek
%Y Beltagy, Iz
%Y Bethard, Steven
%Y Cotterell, Ryan
%Y Chakraborty, Tanmoy
%Y Zhou, Yichao
%S Proceedings of the 2021 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies
%D 2021
%8 June
%I Association for Computational Linguistics
%C Online
%F koto-etal-2021-discourse
%X Existing work on probing of pretrained language models (LMs) has predominantly focused on sentence-level syntactic tasks. In this paper, we introduce document-level discourse probing to evaluate the ability of pretrained LMs to capture document-level relations. We experiment with 7 pretrained LMs, 4 languages, and 7 discourse probing tasks, and find BART to be overall the best model at capturing discourse — but only in its encoder, with BERT performing surprisingly well as the baseline model. Across the different models, there are substantial differences in which layers best capture discourse information, and large disparities between models.
%R 10.18653/v1/2021.naacl-main.301
%U https://aclanthology.org/2021.naacl-main.301
%U https://doi.org/10.18653/v1/2021.naacl-main.301
%P 3849-3864
Markdown (Informal)
[Discourse Probing of Pretrained Language Models](https://aclanthology.org/2021.naacl-main.301) (Koto et al., NAACL 2021)
ACL
Fajri Koto, Jey Han Lau, and Timothy Baldwin. 2021. Discourse Probing of Pretrained Language Models. In Proceedings of the 2021 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies, pages 3849–3864, Online. Association for Computational Linguistics.