Detecting Extraneous Content in Podcasts

Sravana Reddy; Yongze Yu; Aasish Pappu; Aswin Sivaraman; Rezvaneh Rezapour; Rosie Jones

doi:10.18653/v1/2021.eacl-main.99

Abstract

Podcast episodes often contain material extraneous to the main content, such as advertisements, interleaved within the audio and the written descriptions. We present classifiers that leverage both textual and listening patterns in order to detect such content in podcast descriptions and audio transcripts. We demonstrate that our models are effective by evaluating them on the downstream task of podcast summarization and show that we can substantively improve ROUGE scores and reduce the extraneous content generated in the summaries.

Anthology ID: 2021.eacl-main.99
Volume: Proceedings of the 16th Conference of the European Chapter of the Association for Computational Linguistics: Main Volume
Month: April
Year: 2021
Address: Online
Editors: Paola Merlo, Jorg Tiedemann, Reut Tsarfaty
Venue: EACL
Publisher: Association for Computational Linguistics
Pages: 1166–1173
URL: https://aclanthology.org/2021.eacl-main.99/
DOI: 10.18653/v1/2021.eacl-main.99
Cite (ACL): Sravana Reddy, Yongze Yu, Aasish Pappu, Aswin Sivaraman, Rezvaneh Rezapour, and Rosie Jones. 2021. Detecting Extraneous Content in Podcasts. In Proceedings of the 16th Conference of the European Chapter of the Association for Computational Linguistics: Main Volume, pages 1166–1173, Online. Association for Computational Linguistics.
Cite (Informal): Detecting Extraneous Content in Podcasts (Reddy et al., EACL 2021)
PDF: https://aclanthology.org/2021.eacl-main.99.pdf

BibTeX

@inproceedings{reddy-etal-2021-detecting,
    title = "Detecting Extraneous Content in Podcasts",
    author = "Reddy, Sravana  and
      Yu, Yongze  and
      Pappu, Aasish  and
      Sivaraman, Aswin  and
      Rezapour, Rezvaneh  and
      Jones, Rosie",
    editor = "Merlo, Paola  and
      Tiedemann, Jorg  and
      Tsarfaty, Reut",
    booktitle = "Proceedings of the 16th Conference of the European Chapter of the Association for Computational Linguistics: Main Volume",
    month = apr,
    year = "2021",
    address = "Online",
    publisher = "Association for Computational Linguistics",
    url = "https://aclanthology.org/2021.eacl-main.99/",
    doi = "10.18653/v1/2021.eacl-main.99",
    pages = "1166--1173",
    abstract = "Podcast episodes often contain material extraneous to the main content, such as advertisements, interleaved within the audio and the written descriptions. We present classifiers that leverage both textual and listening patterns in order to detect such content in podcast descriptions and audio transcripts. We demonstrate that our models are effective by evaluating them on the downstream task of podcast summarization and show that we can substantively improve ROUGE scores and reduce the extraneous content generated in the summaries."
}

MODS XML

<?xml version="1.0" encoding="UTF-8"?>
<modsCollection xmlns="http://www.loc.gov/mods/v3">
<mods ID="reddy-etal-2021-detecting">
    <titleInfo>
        <title>Detecting Extraneous Content in Podcasts</title>
    </titleInfo>
    <name type="personal">
        <namePart type="given">Sravana</namePart>
        <namePart type="family">Reddy</namePart>
        <role>
            <roleTerm authority="marcrelator" type="text">author</roleTerm>
        </role>
    </name>
    <name type="personal">
        <namePart type="given">Yongze</namePart>
        <namePart type="family">Yu</namePart>
        <role>
            <roleTerm authority="marcrelator" type="text">author</roleTerm>
        </role>
    </name>
    <name type="personal">
        <namePart type="given">Aasish</namePart>
        <namePart type="family">Pappu</namePart>
        <role>
            <roleTerm authority="marcrelator" type="text">author</roleTerm>
        </role>
    </name>
    <name type="personal">
        <namePart type="given">Aswin</namePart>
        <namePart type="family">Sivaraman</namePart>
        <role>
            <roleTerm authority="marcrelator" type="text">author</roleTerm>
        </role>
    </name>
    <name type="personal">
        <namePart type="given">Rezvaneh</namePart>
        <namePart type="family">Rezapour</namePart>
        <role>
            <roleTerm authority="marcrelator" type="text">author</roleTerm>
        </role>
    </name>
    <name type="personal">
        <namePart type="given">Rosie</namePart>
        <namePart type="family">Jones</namePart>
        <role>
            <roleTerm authority="marcrelator" type="text">author</roleTerm>
        </role>
    </name>
    <originInfo>
        <dateIssued>2021-04</dateIssued>
    </originInfo>
    <typeOfResource>text</typeOfResource>
    <relatedItem type="host">
        <titleInfo>
            <title>Proceedings of the 16th Conference of the European Chapter of the Association for Computational Linguistics: Main Volume</title>
        </titleInfo>
        <name type="personal">
            <namePart type="given">Paola</namePart>
            <namePart type="family">Merlo</namePart>
            <role>
                <roleTerm authority="marcrelator" type="text">editor</roleTerm>
            </role>
        </name>
        <name type="personal">
            <namePart type="given">Jorg</namePart>
            <namePart type="family">Tiedemann</namePart>
            <role>
                <roleTerm authority="marcrelator" type="text">editor</roleTerm>
            </role>
        </name>
        <name type="personal">
            <namePart type="given">Reut</namePart>
            <namePart type="family">Tsarfaty</namePart>
            <role>
                <roleTerm authority="marcrelator" type="text">editor</roleTerm>
            </role>
        </name>
        <originInfo>
            <publisher>Association for Computational Linguistics</publisher>
            <place>
                <placeTerm type="text">Online</placeTerm>
            </place>
        </originInfo>
        <genre authority="marcgt">conference publication</genre>
    </relatedItem>
    <abstract>Podcast episodes often contain material extraneous to the main content, such as advertisements, interleaved within the audio and the written descriptions. We present classifiers that leverage both textual and listening patterns in order to detect such content in podcast descriptions and audio transcripts. We demonstrate that our models are effective by evaluating them on the downstream task of podcast summarization and show that we can substantively improve ROUGE scores and reduce the extraneous content generated in the summaries.</abstract>
    <identifier type="citekey">reddy-etal-2021-detecting</identifier>
    <identifier type="doi">10.18653/v1/2021.eacl-main.99</identifier>
    <location>
        <url>https://aclanthology.org/2021.eacl-main.99/</url>
    </location>
    <part>
        <date>2021-04</date>
        <extent unit="page">
            <start>1166</start>
            <end>1173</end>
        </extent>
    </part>
</mods>
</modsCollection>

Endnote

%0 Conference Proceedings
%T Detecting Extraneous Content in Podcasts
%A Reddy, Sravana
%A Yu, Yongze
%A Pappu, Aasish
%A Sivaraman, Aswin
%A Rezapour, Rezvaneh
%A Jones, Rosie
%Y Merlo, Paola
%Y Tiedemann, Jorg
%Y Tsarfaty, Reut
%S Proceedings of the 16th Conference of the European Chapter of the Association for Computational Linguistics: Main Volume
%D 2021
%8 April
%I Association for Computational Linguistics
%C Online
%F reddy-etal-2021-detecting
%X Podcast episodes often contain material extraneous to the main content, such as advertisements, interleaved within the audio and the written descriptions. We present classifiers that leverage both textual and listening patterns in order to detect such content in podcast descriptions and audio transcripts. We demonstrate that our models are effective by evaluating them on the downstream task of podcast summarization and show that we can substantively improve ROUGE scores and reduce the extraneous content generated in the summaries.
%R 10.18653/v1/2021.eacl-main.99
%U https://aclanthology.org/2021.eacl-main.99/
%U https://doi.org/10.18653/v1/2021.eacl-main.99
%P 1166-1173

Markdown (Informal)

[Detecting Extraneous Content in Podcasts](https://aclanthology.org/2021.eacl-main.99/) (Reddy et al., EACL 2021)
