Language ID Prediction from Speech Using Self-Attentive Pooling

Roman Bedyakin; Nikolay Mikhaylovskiy

doi:10.18653/v1/2021.sigtyp-1.12

Abstract

This memo describes NTR-TSU submission for SIGTYP 2021 Shared Task on predicting language IDs from speech. Spoken Language Identification (LID) is an important step in a multilingual Automated Speech Recognition (ASR) system pipeline. For many low-resource and endangered languages, only single-speaker recordings may be available, demanding a need for domain and speaker-invariant language ID systems. In this memo, we show that a convolutional neural network with a Self-Attentive Pooling layer shows promising results for the language identification task.
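The abstract names a Self-Attentive Pooling layer but does not specify it. As a rough illustration only, the sketch below shows one common generic formulation of self-attentive pooling: each frame-level feature vector gets a scalar score, the scores are normalized with a softmax over time, and the utterance-level vector is the weighted sum of the frames. The function name, the parameters `W`, `b`, `v`, and the toy dimensions are assumptions made for this example; they are not taken from the paper and may differ from the authors' architecture.

```python
import math
import random


def self_attentive_pooling(frames, W, b, v):
    """Pool T frame vectors of size D into a single D-dimensional vector.

    frames: list of T lists of length D (e.g. CNN outputs per time step)
    W: D x A projection matrix, b: bias of length A, v: scoring vector of length A
    (all hypothetical parameters; in a real model they would be learned).
    Returns (pooled, weights), where weights sums to 1 over the T frames.
    """
    scores = []
    for h in frames:
        # Hidden attention representation: tanh(h W + b)
        hidden = [
            math.tanh(sum(h[d] * W[d][a] for d in range(len(h))) + b[a])
            for a in range(len(b))
        ]
        # Scalar relevance score for this frame
        scores.append(sum(v[a] * hidden[a] for a in range(len(v))))

    # Numerically stable softmax over the time axis
    top = max(scores)
    exps = [math.exp(s - top) for s in scores]
    total = sum(exps)
    weights = [e / total for e in exps]

    # Attention-weighted sum of the frame vectors
    dim = len(frames[0])
    pooled = [
        sum(weights[t] * frames[t][d] for t in range(len(frames)))
        for d in range(dim)
    ]
    return pooled, weights


# Toy usage with random values standing in for learned parameters.
random.seed(0)
T, D, A = 20, 6, 4
frames = [[random.gauss(0, 1) for _ in range(D)] for _ in range(T)]
W = [[random.gauss(0, 1) for _ in range(A)] for _ in range(D)]
b = [0.0] * A
v = [random.gauss(0, 1) for _ in range(A)]
pooled, weights = self_attentive_pooling(frames, W, b, v)
```

Because the output size depends only on the feature dimension and not on the number of frames, a pooling step of this kind lets a classifier handle utterances of different lengths. With an all-zero scoring vector the weights become uniform and the result reduces to plain average pooling.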

Anthology ID: 2021.sigtyp-1.12
Volume: Proceedings of the Third Workshop on Computational Typology and Multilingual NLP
Month: June
Year: 2021
Address: Online
Editors: Ekaterina Vylomova, Elizabeth Salesky, Sabrina Mielke, Gabriella Lapesa, Ritesh Kumar, Harald Hammarström, Ivan Vulić, Anna Korhonen, Roi Reichart, Edoardo Maria Ponti, Ryan Cotterell
Venue: SIGTYP
SIG: SIGTYP
Publisher: Association for Computational Linguistics
Pages: 130–135
URL: https://aclanthology.org/2021.sigtyp-1.12/
DOI: 10.18653/v1/2021.sigtyp-1.12
Bibkey: bedyakin-mikhaylovskiy-2021-language
Cite (ACL): Roman Bedyakin and Nikolay Mikhaylovskiy. 2021. Language ID Prediction from Speech Using Self-Attentive Pooling. In Proceedings of the Third Workshop on Computational Typology and Multilingual NLP, pages 130–135, Online. Association for Computational Linguistics.
Cite (Informal): Language ID Prediction from Speech Using Self-Attentive Pooling (Bedyakin & Mikhaylovskiy, SIGTYP 2021)
PDF: https://aclanthology.org/2021.sigtyp-1.12.pdf

Export citation

@inproceedings{bedyakin-mikhaylovskiy-2021-language,
    title = "Language {ID} Prediction from Speech Using Self-Attentive Pooling",
    author = "Bedyakin, Roman  and
      Mikhaylovskiy, Nikolay",
    editor = {Vylomova, Ekaterina  and
      Salesky, Elizabeth  and
      Mielke, Sabrina  and
      Lapesa, Gabriella  and
      Kumar, Ritesh  and
      Hammarstr{\"o}m, Harald  and
      Vuli{\'c}, Ivan  and
      Korhonen, Anna  and
      Reichart, Roi  and
      Ponti, Edoardo Maria  and
      Cotterell, Ryan},
    booktitle = "Proceedings of the Third Workshop on Computational Typology and Multilingual NLP",
    month = jun,
    year = "2021",
    address = "Online",
    publisher = "Association for Computational Linguistics",
    url = "https://aclanthology.org/2021.sigtyp-1.12/",
    doi = "10.18653/v1/2021.sigtyp-1.12",
    pages = "130--135",
    abstract = "This memo describes NTR-TSU submission for SIGTYP 2021 Shared Task on predicting language IDs from speech. Spoken Language Identification (LID) is an important step in a multilingual Automated Speech Recognition (ASR) system pipeline. For many low-resource and endangered languages, only single-speaker recordings may be available, demanding a need for domain and speaker-invariant language ID systems. In this memo, we show that a convolutional neural network with a Self-Attentive Pooling layer shows promising results for the language identification task."
}

<?xml version="1.0" encoding="UTF-8"?>
<modsCollection xmlns="http://www.loc.gov/mods/v3">
<mods ID="bedyakin-mikhaylovskiy-2021-language">
    <titleInfo>
        <title>Language ID Prediction from Speech Using Self-Attentive Pooling</title>
    </titleInfo>
    <name type="personal">
        <namePart type="given">Roman</namePart>
        <namePart type="family">Bedyakin</namePart>
        <role>
            <roleTerm authority="marcrelator" type="text">author</roleTerm>
        </role>
    </name>
    <name type="personal">
        <namePart type="given">Nikolay</namePart>
        <namePart type="family">Mikhaylovskiy</namePart>
        <role>
            <roleTerm authority="marcrelator" type="text">author</roleTerm>
        </role>
    </name>
    <originInfo>
        <dateIssued>2021-06</dateIssued>
    </originInfo>
    <typeOfResource>text</typeOfResource>
    <relatedItem type="host">
        <titleInfo>
            <title>Proceedings of the Third Workshop on Computational Typology and Multilingual NLP</title>
        </titleInfo>
        <name type="personal">
            <namePart type="given">Ekaterina</namePart>
            <namePart type="family">Vylomova</namePart>
            <role>
                <roleTerm authority="marcrelator" type="text">editor</roleTerm>
            </role>
        </name>
        <name type="personal">
            <namePart type="given">Elizabeth</namePart>
            <namePart type="family">Salesky</namePart>
            <role>
                <roleTerm authority="marcrelator" type="text">editor</roleTerm>
            </role>
        </name>
        <name type="personal">
            <namePart type="given">Sabrina</namePart>
            <namePart type="family">Mielke</namePart>
            <role>
                <roleTerm authority="marcrelator" type="text">editor</roleTerm>
            </role>
        </name>
        <name type="personal">
            <namePart type="given">Gabriella</namePart>
            <namePart type="family">Lapesa</namePart>
            <role>
                <roleTerm authority="marcrelator" type="text">editor</roleTerm>
            </role>
        </name>
        <name type="personal">
            <namePart type="given">Ritesh</namePart>
            <namePart type="family">Kumar</namePart>
            <role>
                <roleTerm authority="marcrelator" type="text">editor</roleTerm>
            </role>
        </name>
        <name type="personal">
            <namePart type="given">Harald</namePart>
            <namePart type="family">Hammarström</namePart>
            <role>
                <roleTerm authority="marcrelator" type="text">editor</roleTerm>
            </role>
        </name>
        <name type="personal">
            <namePart type="given">Ivan</namePart>
            <namePart type="family">Vulić</namePart>
            <role>
                <roleTerm authority="marcrelator" type="text">editor</roleTerm>
            </role>
        </name>
        <name type="personal">
            <namePart type="given">Anna</namePart>
            <namePart type="family">Korhonen</namePart>
            <role>
                <roleTerm authority="marcrelator" type="text">editor</roleTerm>
            </role>
        </name>
        <name type="personal">
            <namePart type="given">Roi</namePart>
            <namePart type="family">Reichart</namePart>
            <role>
                <roleTerm authority="marcrelator" type="text">editor</roleTerm>
            </role>
        </name>
        <name type="personal">
            <namePart type="given">Edoardo</namePart>
            <namePart type="given">Maria</namePart>
            <namePart type="family">Ponti</namePart>
            <role>
                <roleTerm authority="marcrelator" type="text">editor</roleTerm>
            </role>
        </name>
        <name type="personal">
            <namePart type="given">Ryan</namePart>
            <namePart type="family">Cotterell</namePart>
            <role>
                <roleTerm authority="marcrelator" type="text">editor</roleTerm>
            </role>
        </name>
        <originInfo>
            <publisher>Association for Computational Linguistics</publisher>
            <place>
                <placeTerm type="text">Online</placeTerm>
            </place>
        </originInfo>
        <genre authority="marcgt">conference publication</genre>
    </relatedItem>
    <abstract>This memo describes NTR-TSU submission for SIGTYP 2021 Shared Task on predicting language IDs from speech. Spoken Language Identification (LID) is an important step in a multilingual Automated Speech Recognition (ASR) system pipeline. For many low-resource and endangered languages, only single-speaker recordings may be available, demanding a need for domain and speaker-invariant language ID systems. In this memo, we show that a convolutional neural network with a Self-Attentive Pooling layer shows promising results for the language identification task.</abstract>
    <identifier type="citekey">bedyakin-mikhaylovskiy-2021-language</identifier>
    <identifier type="doi">10.18653/v1/2021.sigtyp-1.12</identifier>
    <location>
        <url>https://aclanthology.org/2021.sigtyp-1.12/</url>
    </location>
    <part>
        <date>2021-06</date>
        <extent unit="page">
            <start>130</start>
            <end>135</end>
        </extent>
    </part>
</mods>
</modsCollection>

%0 Conference Proceedings
%T Language ID Prediction from Speech Using Self-Attentive Pooling
%A Bedyakin, Roman
%A Mikhaylovskiy, Nikolay
%Y Vylomova, Ekaterina
%Y Salesky, Elizabeth
%Y Mielke, Sabrina
%Y Lapesa, Gabriella
%Y Kumar, Ritesh
%Y Hammarström, Harald
%Y Vulić, Ivan
%Y Korhonen, Anna
%Y Reichart, Roi
%Y Ponti, Edoardo Maria
%Y Cotterell, Ryan
%S Proceedings of the Third Workshop on Computational Typology and Multilingual NLP
%D 2021
%8 June
%I Association for Computational Linguistics
%C Online
%F bedyakin-mikhaylovskiy-2021-language
%X This memo describes NTR-TSU submission for SIGTYP 2021 Shared Task on predicting language IDs from speech. Spoken Language Identification (LID) is an important step in a multilingual Automated Speech Recognition (ASR) system pipeline. For many low-resource and endangered languages, only single-speaker recordings may be available, demanding a need for domain and speaker-invariant language ID systems. In this memo, we show that a convolutional neural network with a Self-Attentive Pooling layer shows promising results for the language identification task.
%R 10.18653/v1/2021.sigtyp-1.12
%U https://aclanthology.org/2021.sigtyp-1.12/
%U https://doi.org/10.18653/v1/2021.sigtyp-1.12
%P 130-135
