Zero-shot Sequence Labeling for Transformer-based Sentence Classifiers

Kamil Bujel; Helen Yannakoudakis; Marek Rei

doi:10.18653/v1/2021.repl4nlp-1.20

Abstract

We investigate how sentence-level transformers can be modified into effective sequence labelers at the token level without any direct supervision. Existing approaches to zero-shot sequence labeling do not perform well when applied on transformer-based architectures. As transformers contain multiple layers of multi-head self-attention, information in the sentence gets distributed between many tokens, negatively affecting zero-shot token-level performance. We find that a soft attention module which explicitly encourages sharpness of attention weights can significantly outperform existing methods.
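As a rough illustration of what "encouraging sharpness of attention weights" can mean, the sketch below adds an entropy penalty on token-level attention to a sentence-level loss and then reads token labels off the attention weights. This is not the authors' implementation: the abstract does not specify the module or the objective, so the softmax normalization, the entropy penalty, the weight `lam`, the threshold, and all function names are assumptions chosen only for illustration.

```python
import math


def softmax(scores):
    """Turn raw token scores into attention weights that sum to 1."""
    m = max(scores)  # subtract the max for numerical stability
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def entropy(weights):
    """Shannon entropy of the attention weights; lower means sharper."""
    return -sum(w * math.log(w) for w in weights if w > 0.0)


def sharpness_regularized_loss(sentence_loss, weights, lam=0.1):
    """Hypothetical objective: sentence-level loss plus an entropy penalty.

    `lam` is an assumed hyperparameter, not a value from the paper.
    """
    return sentence_loss + lam * entropy(weights)


def label_tokens(weights, threshold=0.5):
    """Zero-shot token labels: flag tokens whose attention passes a threshold."""
    return [1 if w >= threshold else 0 for w in weights]


# Two toy attention distributions over a four-token sentence.
diffuse = softmax([0.1, 0.0, 0.2, 0.1])  # information spread over all tokens
sharp = softmax([0.1, 0.0, 4.0, 0.1])    # mass concentrated on one token

# With the same sentence-level loss, the sharper distribution is penalized less.
loss_diffuse = sharpness_regularized_loss(0.3, diffuse)
loss_sharp = sharpness_regularized_loss(0.3, sharp)
```

Under these assumptions, a penalty of this kind pushes the model toward attention that singles out a few tokens, which is what makes thresholding the weights usable as a token-level prediction without any token-level supervision.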

Anthology ID: 2021.repl4nlp-1.20
Volume: Proceedings of the 6th Workshop on Representation Learning for NLP (RepL4NLP-2021)
Month: August
Year: 2021
Address: Online
Editors: Anna Rogers, Iacer Calixto, Ivan Vulić, Naomi Saphra, Nora Kassner, Oana-Maria Camburu, Trapit Bansal, Vered Shwartz
Venue: RepL4NLP
Publisher: Association for Computational Linguistics
Pages: 195–205
URL: https://aclanthology.org/2021.repl4nlp-1.20/
DOI: 10.18653/v1/2021.repl4nlp-1.20
Bibkey: bujel-etal-2021-zero
Cite (ACL): Kamil Bujel, Helen Yannakoudakis, and Marek Rei. 2021. Zero-shot Sequence Labeling for Transformer-based Sentence Classifiers. In Proceedings of the 6th Workshop on Representation Learning for NLP (RepL4NLP-2021), pages 195–205, Online. Association for Computational Linguistics.
Cite (Informal): Zero-shot Sequence Labeling for Transformer-based Sentence Classifiers (Bujel et al., RepL4NLP 2021)
PDF: https://aclanthology.org/2021.repl4nlp-1.20.pdf
Optional supplementary material: 2021.repl4nlp-1.20.OptionalSupplementaryMaterial.zip
Video: https://aclanthology.org/2021.repl4nlp-1.20.mp4

Export citation

BibTeX

@inproceedings{bujel-etal-2021-zero,
    title = "Zero-shot Sequence Labeling for Transformer-based Sentence Classifiers",
    author = "Bujel, Kamil  and
      Yannakoudakis, Helen  and
      Rei, Marek",
    editor = "Rogers, Anna  and
      Calixto, Iacer  and
      Vuli{\'c}, Ivan  and
      Saphra, Naomi  and
      Kassner, Nora  and
      Camburu, Oana-Maria  and
      Bansal, Trapit  and
      Shwartz, Vered",
    booktitle = "Proceedings of the 6th Workshop on Representation Learning for NLP (RepL4NLP-2021)",
    month = aug,
    year = "2021",
    address = "Online",
    publisher = "Association for Computational Linguistics",
    url = "https://aclanthology.org/2021.repl4nlp-1.20/",
    doi = "10.18653/v1/2021.repl4nlp-1.20",
    pages = "195--205",
    abstract = "We investigate how sentence-level transformers can be modified into effective sequence labelers at the token level without any direct supervision. Existing approaches to zero-shot sequence labeling do not perform well when applied on transformer-based architectures. As transformers contain multiple layers of multi-head self-attention, information in the sentence gets distributed between many tokens, negatively affecting zero-shot token-level performance. We find that a soft attention module which explicitly encourages sharpness of attention weights can significantly outperform existing methods."
}

MODS XML

<?xml version="1.0" encoding="UTF-8"?>
<modsCollection xmlns="http://www.loc.gov/mods/v3">
<mods ID="bujel-etal-2021-zero">
    <titleInfo>
        <title>Zero-shot Sequence Labeling for Transformer-based Sentence Classifiers</title>
    </titleInfo>
    <name type="personal">
        <namePart type="given">Kamil</namePart>
        <namePart type="family">Bujel</namePart>
        <role>
            <roleTerm authority="marcrelator" type="text">author</roleTerm>
        </role>
    </name>
    <name type="personal">
        <namePart type="given">Helen</namePart>
        <namePart type="family">Yannakoudakis</namePart>
        <role>
            <roleTerm authority="marcrelator" type="text">author</roleTerm>
        </role>
    </name>
    <name type="personal">
        <namePart type="given">Marek</namePart>
        <namePart type="family">Rei</namePart>
        <role>
            <roleTerm authority="marcrelator" type="text">author</roleTerm>
        </role>
    </name>
    <originInfo>
        <dateIssued>2021-08</dateIssued>
    </originInfo>
    <typeOfResource>text</typeOfResource>
    <relatedItem type="host">
        <titleInfo>
            <title>Proceedings of the 6th Workshop on Representation Learning for NLP (RepL4NLP-2021)</title>
        </titleInfo>
        <name type="personal">
            <namePart type="given">Anna</namePart>
            <namePart type="family">Rogers</namePart>
            <role>
                <roleTerm authority="marcrelator" type="text">editor</roleTerm>
            </role>
        </name>
        <name type="personal">
            <namePart type="given">Iacer</namePart>
            <namePart type="family">Calixto</namePart>
            <role>
                <roleTerm authority="marcrelator" type="text">editor</roleTerm>
            </role>
        </name>
        <name type="personal">
            <namePart type="given">Ivan</namePart>
            <namePart type="family">Vulić</namePart>
            <role>
                <roleTerm authority="marcrelator" type="text">editor</roleTerm>
            </role>
        </name>
        <name type="personal">
            <namePart type="given">Naomi</namePart>
            <namePart type="family">Saphra</namePart>
            <role>
                <roleTerm authority="marcrelator" type="text">editor</roleTerm>
            </role>
        </name>
        <name type="personal">
            <namePart type="given">Nora</namePart>
            <namePart type="family">Kassner</namePart>
            <role>
                <roleTerm authority="marcrelator" type="text">editor</roleTerm>
            </role>
        </name>
        <name type="personal">
            <namePart type="given">Oana-Maria</namePart>
            <namePart type="family">Camburu</namePart>
            <role>
                <roleTerm authority="marcrelator" type="text">editor</roleTerm>
            </role>
        </name>
        <name type="personal">
            <namePart type="given">Trapit</namePart>
            <namePart type="family">Bansal</namePart>
            <role>
                <roleTerm authority="marcrelator" type="text">editor</roleTerm>
            </role>
        </name>
        <name type="personal">
            <namePart type="given">Vered</namePart>
            <namePart type="family">Shwartz</namePart>
            <role>
                <roleTerm authority="marcrelator" type="text">editor</roleTerm>
            </role>
        </name>
        <originInfo>
            <publisher>Association for Computational Linguistics</publisher>
            <place>
                <placeTerm type="text">Online</placeTerm>
            </place>
        </originInfo>
        <genre authority="marcgt">conference publication</genre>
    </relatedItem>
    <abstract>We investigate how sentence-level transformers can be modified into effective sequence labelers at the token level without any direct supervision. Existing approaches to zero-shot sequence labeling do not perform well when applied on transformer-based architectures. As transformers contain multiple layers of multi-head self-attention, information in the sentence gets distributed between many tokens, negatively affecting zero-shot token-level performance. We find that a soft attention module which explicitly encourages sharpness of attention weights can significantly outperform existing methods.</abstract>
    <identifier type="citekey">bujel-etal-2021-zero</identifier>
    <identifier type="doi">10.18653/v1/2021.repl4nlp-1.20</identifier>
    <location>
        <url>https://aclanthology.org/2021.repl4nlp-1.20/</url>
    </location>
    <part>
        <date>2021-08</date>
        <extent unit="page">
            <start>195</start>
            <end>205</end>
        </extent>
    </part>
</mods>
</modsCollection>

Endnote

%0 Conference Proceedings
%T Zero-shot Sequence Labeling for Transformer-based Sentence Classifiers
%A Bujel, Kamil
%A Yannakoudakis, Helen
%A Rei, Marek
%Y Rogers, Anna
%Y Calixto, Iacer
%Y Vulić, Ivan
%Y Saphra, Naomi
%Y Kassner, Nora
%Y Camburu, Oana-Maria
%Y Bansal, Trapit
%Y Shwartz, Vered
%S Proceedings of the 6th Workshop on Representation Learning for NLP (RepL4NLP-2021)
%D 2021
%8 August
%I Association for Computational Linguistics
%C Online
%F bujel-etal-2021-zero
%X We investigate how sentence-level transformers can be modified into effective sequence labelers at the token level without any direct supervision. Existing approaches to zero-shot sequence labeling do not perform well when applied on transformer-based architectures. As transformers contain multiple layers of multi-head self-attention, information in the sentence gets distributed between many tokens, negatively affecting zero-shot token-level performance. We find that a soft attention module which explicitly encourages sharpness of attention weights can significantly outperform existing methods.
%R 10.18653/v1/2021.repl4nlp-1.20
%U https://aclanthology.org/2021.repl4nlp-1.20/
%U https://doi.org/10.18653/v1/2021.repl4nlp-1.20
%P 195-205

Markdown (Informal)

[Zero-shot Sequence Labeling for Transformer-based Sentence Classifiers](https://aclanthology.org/2021.repl4nlp-1.20/) (Bujel et al., RepL4NLP 2021)