Quick Rich Transcriptions of Arabic Broadcast News Speech Data

Chomicha Bendahman; Meghan Glenn; Djamel Mostefa; Niklas Paulsson; Stephanie Strassel

Quick Rich Transcriptions of Arabic Broadcast News Speech Data

Chomicha Bendahman, Meghan Glenn, Djamel Mostefa, Niklas Paulsson, Stephanie Strassel

Correct Metadata for

Use this form to create a GitHub issue with structured data describing the correction. You will need a GitHub account. Once you create that issue, the correction will be reviewed by a staff member.

⚠️ Mobile Users: Submitting this form to create a new issue will only work with github.com, not the GitHub Mobile app.

Important: The Anthology treat PDFs as authoritative. Please use this form only to correct data that is out of line with the PDF. See our corrections guidelines if you need to change the PDF.

Title Adjust the title. Retain tags such as <fixed-case>.

Authors Adjust author names and order to match the PDF.

Abstract Correct abstract if needed. Retain XML formatting tags such as <tex-math>. You may use <b>...</b> for bold, <i>...</i> for italic, and <url>...</url> for URLs.

Verification against PDF Ensure that the new title/authors match the snapshot below. (If there is no snapshot or it is too small, consult the PDF.)

Authors concatenated from the text boxes above:

ALL author names match the snapshot above—including middle initials, hyphens, and accents.

Abstract

This paper describes the collect and transcription of a large set of Arabic broadcast news speech data. A total of more than 2000 hours of data was transcribed. The transcription factor for transcribing the broadcast news data has been reduced using a method such as Quick Rich Transcription (QRTR) as well as reducing the number of quality controls performed on the data. The data was collected from several Arabic TV and radio sources and from both Modern Standard Arabic and dialectal Arabic. The orthographic transcriptions included segmentation, speaker turns, topics, sentence unit types and a minimal noise mark-up. The transcripts were produced as a part of the GALE project.

Anthology ID:: L08-1124
Volume:: Proceedings of the Sixth International Conference on Language Resources and Evaluation (LREC'08)
Month:: May
Year:: 2008
Address:: Marrakech, Morocco
Editors:: Nicoletta Calzolari, Khalid Choukri, Bente Maegaard, Joseph Mariani, Jan Odijk, Stelios Piperidis, Daniel Tapias
Venue:: LREC
SIG:
Publisher:: European Language Resources Association (ELRA)
Note:
Pages:
Language:
URL:: http://www.lrec-conf.org/proceedings/lrec2008/pdf/915_paper.pdf
DOI:
Bibkey:
Cite (ACL):: Chomicha Bendahman, Meghan Glenn, Djamel Mostefa, Niklas Paulsson, and Stephanie Strassel. 2008. Quick Rich Transcriptions of Arabic Broadcast News Speech Data. In Proceedings of the Sixth International Conference on Language Resources and Evaluation (LREC'08), Marrakech, Morocco. European Language Resources Association (ELRA).
Cite (Informal):: Quick Rich Transcriptions of Arabic Broadcast News Speech Data (Bendahman et al., LREC 2008)
Copy Citation:
PDF:: http://www.lrec-conf.org/proceedings/lrec2008/pdf/915_paper.pdf

PDF Cite Search Fix data

Export citation

BibTeX
MODS XML
Endnote
Preformatted

@inproceedings{bendahman-etal-2008-quick,
    title = "Quick Rich Transcriptions of {A}rabic Broadcast News Speech Data",
    author = "Bendahman, Chomicha  and
      Glenn, Meghan  and
      Mostefa, Djamel  and
      Paulsson, Niklas  and
      Strassel, Stephanie",
    editor = "Calzolari, Nicoletta  and
      Choukri, Khalid  and
      Maegaard, Bente  and
      Mariani, Joseph  and
      Odijk, Jan  and
      Piperidis, Stelios  and
      Tapias, Daniel",
    booktitle = "Proceedings of the Sixth International Conference on Language Resources and Evaluation ({LREC}'08)",
    month = may,
    year = "2008",
    address = "Marrakech, Morocco",
    publisher = "European Language Resources Association (ELRA)",
    url = "https://aclanthology.org/L08-1124/",
    abstract = "This paper describes the collect and transcription of a large set of Arabic broadcast news speech data. A total of more than 2000 hours of data was transcribed. The transcription factor for transcribing the broadcast news data has been reduced using a method such as Quick Rich Transcription (QRTR) as well as reducing the number of quality controls performed on the data. The data was collected from several Arabic TV and radio sources and from both Modern Standard Arabic and dialectal Arabic. The orthographic transcriptions included segmentation, speaker turns, topics, sentence unit types and a minimal noise mark-up. The transcripts were produced as a part of the GALE project."
}

Download as File

<?xml version="1.0" encoding="UTF-8"?>
<modsCollection xmlns="http://www.loc.gov/mods/v3">
<mods ID="bendahman-etal-2008-quick">
    <titleInfo>
        <title>Quick Rich Transcriptions of Arabic Broadcast News Speech Data</title>
    </titleInfo>
    <name type="personal">
        <namePart type="given">Chomicha</namePart>
        <namePart type="family">Bendahman</namePart>
        <role>
            <roleTerm authority="marcrelator" type="text">author</roleTerm>
        </role>
    </name>
    <name type="personal">
        <namePart type="given">Meghan</namePart>
        <namePart type="family">Glenn</namePart>
        <role>
            <roleTerm authority="marcrelator" type="text">author</roleTerm>
        </role>
    </name>
    <name type="personal">
        <namePart type="given">Djamel</namePart>
        <namePart type="family">Mostefa</namePart>
        <role>
            <roleTerm authority="marcrelator" type="text">author</roleTerm>
        </role>
    </name>
    <name type="personal">
        <namePart type="given">Niklas</namePart>
        <namePart type="family">Paulsson</namePart>
        <role>
            <roleTerm authority="marcrelator" type="text">author</roleTerm>
        </role>
    </name>
    <name type="personal">
        <namePart type="given">Stephanie</namePart>
        <namePart type="family">Strassel</namePart>
        <role>
            <roleTerm authority="marcrelator" type="text">author</roleTerm>
        </role>
    </name>
    <originInfo>
        <dateIssued>2008-05</dateIssued>
    </originInfo>
    <typeOfResource>text</typeOfResource>
    <relatedItem type="host">
        <titleInfo>
            <title>Proceedings of the Sixth International Conference on Language Resources and Evaluation (LREC’08)</title>
        </titleInfo>
        <name type="personal">
            <namePart type="given">Nicoletta</namePart>
            <namePart type="family">Calzolari</namePart>
            <role>
                <roleTerm authority="marcrelator" type="text">editor</roleTerm>
            </role>
        </name>
        <name type="personal">
            <namePart type="given">Khalid</namePart>
            <namePart type="family">Choukri</namePart>
            <role>
                <roleTerm authority="marcrelator" type="text">editor</roleTerm>
            </role>
        </name>
        <name type="personal">
            <namePart type="given">Bente</namePart>
            <namePart type="family">Maegaard</namePart>
            <role>
                <roleTerm authority="marcrelator" type="text">editor</roleTerm>
            </role>
        </name>
        <name type="personal">
            <namePart type="given">Joseph</namePart>
            <namePart type="family">Mariani</namePart>
            <role>
                <roleTerm authority="marcrelator" type="text">editor</roleTerm>
            </role>
        </name>
        <name type="personal">
            <namePart type="given">Jan</namePart>
            <namePart type="family">Odijk</namePart>
            <role>
                <roleTerm authority="marcrelator" type="text">editor</roleTerm>
            </role>
        </name>
        <name type="personal">
            <namePart type="given">Stelios</namePart>
            <namePart type="family">Piperidis</namePart>
            <role>
                <roleTerm authority="marcrelator" type="text">editor</roleTerm>
            </role>
        </name>
        <name type="personal">
            <namePart type="given">Daniel</namePart>
            <namePart type="family">Tapias</namePart>
            <role>
                <roleTerm authority="marcrelator" type="text">editor</roleTerm>
            </role>
        </name>
        <originInfo>
            <publisher>European Language Resources Association (ELRA)</publisher>
            <place>
                <placeTerm type="text">Marrakech, Morocco</placeTerm>
            </place>
        </originInfo>
        <genre authority="marcgt">conference publication</genre>
    </relatedItem>
    <abstract>This paper describes the collect and transcription of a large set of Arabic broadcast news speech data. A total of more than 2000 hours of data was transcribed. The transcription factor for transcribing the broadcast news data has been reduced using a method such as Quick Rich Transcription (QRTR) as well as reducing the number of quality controls performed on the data. The data was collected from several Arabic TV and radio sources and from both Modern Standard Arabic and dialectal Arabic. The orthographic transcriptions included segmentation, speaker turns, topics, sentence unit types and a minimal noise mark-up. The transcripts were produced as a part of the GALE project.</abstract>
    <identifier type="citekey">bendahman-etal-2008-quick</identifier>
    <location>
        <url>https://aclanthology.org/L08-1124/</url>
    </location>
    <part>
        <date>2008-05</date>
    </part>
</mods>
</modsCollection>

Download as File

%0 Conference Proceedings
%T Quick Rich Transcriptions of Arabic Broadcast News Speech Data
%A Bendahman, Chomicha
%A Glenn, Meghan
%A Mostefa, Djamel
%A Paulsson, Niklas
%A Strassel, Stephanie
%Y Calzolari, Nicoletta
%Y Choukri, Khalid
%Y Maegaard, Bente
%Y Mariani, Joseph
%Y Odijk, Jan
%Y Piperidis, Stelios
%Y Tapias, Daniel
%S Proceedings of the Sixth International Conference on Language Resources and Evaluation (LREC’08)
%D 2008
%8 May
%I European Language Resources Association (ELRA)
%C Marrakech, Morocco
%F bendahman-etal-2008-quick
%X This paper describes the collect and transcription of a large set of Arabic broadcast news speech data. A total of more than 2000 hours of data was transcribed. The transcription factor for transcribing the broadcast news data has been reduced using a method such as Quick Rich Transcription (QRTR) as well as reducing the number of quality controls performed on the data. The data was collected from several Arabic TV and radio sources and from both Modern Standard Arabic and dialectal Arabic. The orthographic transcriptions included segmentation, speaker turns, topics, sentence unit types and a minimal noise mark-up. The transcripts were produced as a part of the GALE project.
%U https://aclanthology.org/L08-1124/

Download as File

Markdown (Informal)

[Quick Rich Transcriptions of Arabic Broadcast News Speech Data](https://aclanthology.org/L08-1124/) (Bendahman et al., LREC 2008)

Quick Rich Transcriptions of Arabic Broadcast News Speech Data (Bendahman et al., LREC 2008)

ACL

Chomicha Bendahman, Meghan Glenn, Djamel Mostefa, Niklas Paulsson, and Stephanie Strassel. 2008. Quick Rich Transcriptions of Arabic Broadcast News Speech Data. In Proceedings of the Sixth International Conference on Language Resources and Evaluation (LREC'08), Marrakech, Morocco. European Language Resources Association (ELRA).