Inverted Projection for Robust Speech Translation

Dirk Padfield; Colin Cherry

doi:10.18653/v1/2021.iwslt-1.28

Abstract

Traditional translation systems trained on written documents perform well for text-based translation but not as well for speech-based applications. We aim to adapt translation models to speech by introducing actual lexical errors from ASR and segmentation errors from automatic punctuation into our translation training data. We introduce an inverted projection approach that projects automatically detected system segments onto human transcripts and then re-segments the gold translations to align with the projected human transcripts. We demonstrate that this overcomes the train-test mismatch present in other training approaches. The new projection approach achieves gains of over 1 BLEU point over a baseline that is exposed to the human transcripts and segmentations, and these gains hold for both IWSLT data and YouTube data.
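The first step named in the abstract, projecting system segments onto the human transcript, can be illustrated with a minimal sketch. This is not the authors' implementation: it assumes a token-level alignment from Python's `difflib.SequenceMatcher`, and the function name `project_boundaries` and the toy data are hypothetical. The second step, re-segmenting the gold translations, needs an alignment between transcript and translation that the abstract does not specify, so it is not shown.

```python
from difflib import SequenceMatcher


def project_boundaries(system_segments, human_tokens):
    """Regroup human-transcript tokens to follow the system segmentation.

    system_segments: list of token lists (e.g. ASR output split by
        automatic punctuation).
    human_tokens: flat list of human-transcript tokens for the same audio.
    Returns one list of human tokens per system segment.
    """
    system_tokens = [tok for seg in system_segments for tok in seg]
    # Assumption: difflib's matching stands in for whatever alignment
    # the paper actually uses.
    matcher = SequenceMatcher(a=system_tokens, b=human_tokens, autojunk=False)

    # sys_to_human[i] is the human-token position aligned with the start
    # of system token i; the extra last slot marks the end of both streams.
    sys_to_human = [0] * (len(system_tokens) + 1)
    for _tag, i1, i2, j1, j2 in matcher.get_opcodes():
        for i in range(i1, i2):
            # Advance in step inside a span, clamped to the span's end.
            sys_to_human[i] = min(j1 + (i - i1), j2)
    sys_to_human[len(system_tokens)] = len(human_tokens)

    # Cut the human transcript at the projected segment ends. Starting
    # at 0 keeps any leading human tokens that have no system counterpart.
    segments, sys_pos, human_start = [], 0, 0
    for seg in system_segments:
        sys_pos += len(seg)
        human_end = sys_to_human[sys_pos]
        segments.append(human_tokens[human_start:human_end])
        human_start = human_end
    return segments


# Toy example: the ASR output has two lexical errors ("tee", "two").
system = [["i", "like", "tee"], ["and", "coffee", "two"]]
human = "i like tea and coffee too".split()
print(project_boundaries(system, human))
# [['i', 'like', 'tea'], ['and', 'coffee', 'too']]
```

The projected segments keep the clean human wording but follow the system's segment boundaries, which is the property the abstract credits with removing the train-test mismatch.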

Anthology ID: 2021.iwslt-1.28
Volume: Proceedings of the 18th International Conference on Spoken Language Translation (IWSLT 2021)
Month: August
Year: 2021
Address: Bangkok, Thailand (online)
Editors: Marcello Federico, Alex Waibel, Marta R. Costa-jussà, Jan Niehues, Sebastian Stuker, Elizabeth Salesky
Venue: IWSLT
SIG: SIGSLT
Publisher: Association for Computational Linguistics
Pages: 236–244
URL: https://aclanthology.org/2021.iwslt-1.28/
DOI: 10.18653/v1/2021.iwslt-1.28
Bibkey: padfield-cherry-2021-inverted
Cite (ACL): Dirk Padfield and Colin Cherry. 2021. Inverted Projection for Robust Speech Translation. In Proceedings of the 18th International Conference on Spoken Language Translation (IWSLT 2021), pages 236–244, Bangkok, Thailand (online). Association for Computational Linguistics.
Cite (Informal): Inverted Projection for Robust Speech Translation (Padfield & Cherry, IWSLT 2021)
PDF: https://aclanthology.org/2021.iwslt-1.28.pdf

Export citation

BibTeX

@inproceedings{padfield-cherry-2021-inverted,
    title = "Inverted Projection for Robust Speech Translation",
    author = "Padfield, Dirk  and
      Cherry, Colin",
    editor = "Federico, Marcello  and
      Waibel, Alex  and
      Costa-juss{\`a}, Marta R.  and
      Niehues, Jan  and
      Stuker, Sebastian  and
      Salesky, Elizabeth",
    booktitle = "Proceedings of the 18th International Conference on Spoken Language Translation (IWSLT 2021)",
    month = aug,
    year = "2021",
    address = "Bangkok, Thailand (online)",
    publisher = "Association for Computational Linguistics",
    url = "https://aclanthology.org/2021.iwslt-1.28/",
    doi = "10.18653/v1/2021.iwslt-1.28",
    pages = "236--244",
    abstract = "Traditional translation systems trained on written documents perform well for text-based translation but not as well for speech-based applications. We aim to adapt translation models to speech by introducing actual lexical errors from ASR and segmentation errors from automatic punctuation into our translation training data. We introduce an inverted projection approach that projects automatically detected system segments onto human transcripts and then re-segments the gold translations to align with the projected human transcripts. We demonstrate that this overcomes the train-test mismatch present in other training approaches. The new projection approach achieves gains of over 1 BLEU point over a baseline that is exposed to the human transcripts and segmentations, and these gains hold for both IWSLT data and YouTube data."
}

MODS XML

<?xml version="1.0" encoding="UTF-8"?>
<modsCollection xmlns="http://www.loc.gov/mods/v3">
<mods ID="padfield-cherry-2021-inverted">
    <titleInfo>
        <title>Inverted Projection for Robust Speech Translation</title>
    </titleInfo>
    <name type="personal">
        <namePart type="given">Dirk</namePart>
        <namePart type="family">Padfield</namePart>
        <role>
            <roleTerm authority="marcrelator" type="text">author</roleTerm>
        </role>
    </name>
    <name type="personal">
        <namePart type="given">Colin</namePart>
        <namePart type="family">Cherry</namePart>
        <role>
            <roleTerm authority="marcrelator" type="text">author</roleTerm>
        </role>
    </name>
    <originInfo>
        <dateIssued>2021-08</dateIssued>
    </originInfo>
    <typeOfResource>text</typeOfResource>
    <relatedItem type="host">
        <titleInfo>
            <title>Proceedings of the 18th International Conference on Spoken Language Translation (IWSLT 2021)</title>
        </titleInfo>
        <name type="personal">
            <namePart type="given">Marcello</namePart>
            <namePart type="family">Federico</namePart>
            <role>
                <roleTerm authority="marcrelator" type="text">editor</roleTerm>
            </role>
        </name>
        <name type="personal">
            <namePart type="given">Alex</namePart>
            <namePart type="family">Waibel</namePart>
            <role>
                <roleTerm authority="marcrelator" type="text">editor</roleTerm>
            </role>
        </name>
        <name type="personal">
            <namePart type="given">Marta</namePart>
            <namePart type="given">R</namePart>
            <namePart type="family">Costa-jussà</namePart>
            <role>
                <roleTerm authority="marcrelator" type="text">editor</roleTerm>
            </role>
        </name>
        <name type="personal">
            <namePart type="given">Jan</namePart>
            <namePart type="family">Niehues</namePart>
            <role>
                <roleTerm authority="marcrelator" type="text">editor</roleTerm>
            </role>
        </name>
        <name type="personal">
            <namePart type="given">Sebastian</namePart>
            <namePart type="family">Stuker</namePart>
            <role>
                <roleTerm authority="marcrelator" type="text">editor</roleTerm>
            </role>
        </name>
        <name type="personal">
            <namePart type="given">Elizabeth</namePart>
            <namePart type="family">Salesky</namePart>
            <role>
                <roleTerm authority="marcrelator" type="text">editor</roleTerm>
            </role>
        </name>
        <originInfo>
            <publisher>Association for Computational Linguistics</publisher>
            <place>
                <placeTerm type="text">Bangkok, Thailand (online)</placeTerm>
            </place>
        </originInfo>
        <genre authority="marcgt">conference publication</genre>
    </relatedItem>
    <abstract>Traditional translation systems trained on written documents perform well for text-based translation but not as well for speech-based applications. We aim to adapt translation models to speech by introducing actual lexical errors from ASR and segmentation errors from automatic punctuation into our translation training data. We introduce an inverted projection approach that projects automatically detected system segments onto human transcripts and then re-segments the gold translations to align with the projected human transcripts. We demonstrate that this overcomes the train-test mismatch present in other training approaches. The new projection approach achieves gains of over 1 BLEU point over a baseline that is exposed to the human transcripts and segmentations, and these gains hold for both IWSLT data and YouTube data.</abstract>
    <identifier type="citekey">padfield-cherry-2021-inverted</identifier>
    <identifier type="doi">10.18653/v1/2021.iwslt-1.28</identifier>
    <location>
        <url>https://aclanthology.org/2021.iwslt-1.28/</url>
    </location>
    <part>
        <date>2021-08</date>
        <extent unit="page">
            <start>236</start>
            <end>244</end>
        </extent>
    </part>
</mods>
</modsCollection>

Endnote

%0 Conference Proceedings
%T Inverted Projection for Robust Speech Translation
%A Padfield, Dirk
%A Cherry, Colin
%Y Federico, Marcello
%Y Waibel, Alex
%Y Costa-jussà, Marta R.
%Y Niehues, Jan
%Y Stuker, Sebastian
%Y Salesky, Elizabeth
%S Proceedings of the 18th International Conference on Spoken Language Translation (IWSLT 2021)
%D 2021
%8 August
%I Association for Computational Linguistics
%C Bangkok, Thailand (online)
%F padfield-cherry-2021-inverted
%X Traditional translation systems trained on written documents perform well for text-based translation but not as well for speech-based applications. We aim to adapt translation models to speech by introducing actual lexical errors from ASR and segmentation errors from automatic punctuation into our translation training data. We introduce an inverted projection approach that projects automatically detected system segments onto human transcripts and then re-segments the gold translations to align with the projected human transcripts. We demonstrate that this overcomes the train-test mismatch present in other training approaches. The new projection approach achieves gains of over 1 BLEU point over a baseline that is exposed to the human transcripts and segmentations, and these gains hold for both IWSLT data and YouTube data.
%R 10.18653/v1/2021.iwslt-1.28
%U https://aclanthology.org/2021.iwslt-1.28/
%U https://doi.org/10.18653/v1/2021.iwslt-1.28
%P 236-244

Markdown (Informal)

[Inverted Projection for Robust Speech Translation](https://aclanthology.org/2021.iwslt-1.28/) (Padfield & Cherry, IWSLT 2021)
