@inproceedings{moore-etal-2016-automated,
    title = "Automated speech-unit delimitation in spoken learner {E}nglish",
    author = "Moore, Russell  and
      Caines, Andrew  and
      Graham, Calbert  and
      Buttery, Paula",
    editor = "Matsumoto, Yuji  and
      Prasad, Rashmi",
    booktitle = "Proceedings of {COLING} 2016, the 26th International Conference on Computational Linguistics: Technical Papers",
    month = dec,
    year = "2016",
    address = "Osaka, Japan",
    publisher = "The COLING 2016 Organizing Committee",
    url = "https://aclanthology.org/C16-1075/",
    pages = "782--793",
    abstract = "In order to apply computational linguistic analyses and pass information to downstream applications, transcriptions of speech obtained via automatic speech recognition (ASR) need to be divided into smaller meaningful units, in a task we refer to as `speech-unit (SU) delimitation'. We closely recreate the automatic delimitation system described by Lee and Glass (2012), `Sentence detection using multiple annotations', Proceedings of INTERSPEECH, which combines a prosodic model, language model and speech-unit length model in log-linear fashion. Since state-of-the-art natural language processing (NLP) tools have been developed to deal with written text and its characteristic sentence-like units, SU delimitation helps bridge the gap between ASR and NLP, by normalising spoken data into a more canonical format. Previous work has focused on native speaker recordings; we test the system of Lee and Glass (2012) on non-native speaker (or `learner') data, achieving performance above the state-of-the-art. We also consider alternative evaluation metrics which move away from the idea of a single `truth' in SU delimitation, and frame this work in the context of downstream NLP applications."
}
<?xml version="1.0" encoding="UTF-8"?>
<modsCollection xmlns="http://www.loc.gov/mods/v3">
<mods ID="moore-etal-2016-automated">
    <titleInfo>
        <title>Automated speech-unit delimitation in spoken learner English</title>
    </titleInfo>
    <name type="personal">
        <namePart type="given">Russell</namePart>
        <namePart type="family">Moore</namePart>
        <role>
            <roleTerm authority="marcrelator" type="text">author</roleTerm>
        </role>
    </name>
    <name type="personal">
        <namePart type="given">Andrew</namePart>
        <namePart type="family">Caines</namePart>
        <role>
            <roleTerm authority="marcrelator" type="text">author</roleTerm>
        </role>
    </name>
    <name type="personal">
        <namePart type="given">Calbert</namePart>
        <namePart type="family">Graham</namePart>
        <role>
            <roleTerm authority="marcrelator" type="text">author</roleTerm>
        </role>
    </name>
    <name type="personal">
        <namePart type="given">Paula</namePart>
        <namePart type="family">Buttery</namePart>
        <role>
            <roleTerm authority="marcrelator" type="text">author</roleTerm>
        </role>
    </name>
    <originInfo>
        <dateIssued>2016-12</dateIssued>
    </originInfo>
    <typeOfResource>text</typeOfResource>
    <relatedItem type="host">
        <titleInfo>
            <title>Proceedings of COLING 2016, the 26th International Conference on Computational Linguistics: Technical Papers</title>
        </titleInfo>
        <name type="personal">
            <namePart type="given">Yuji</namePart>
            <namePart type="family">Matsumoto</namePart>
            <role>
                <roleTerm authority="marcrelator" type="text">editor</roleTerm>
            </role>
        </name>
        <name type="personal">
            <namePart type="given">Rashmi</namePart>
            <namePart type="family">Prasad</namePart>
            <role>
                <roleTerm authority="marcrelator" type="text">editor</roleTerm>
            </role>
        </name>
        <originInfo>
            <publisher>The COLING 2016 Organizing Committee</publisher>
            <place>
                <placeTerm type="text">Osaka, Japan</placeTerm>
            </place>
        </originInfo>
        <genre authority="marcgt">conference publication</genre>
    </relatedItem>
    <abstract>In order to apply computational linguistic analyses and pass information to downstream applications, transcriptions of speech obtained via automatic speech recognition (ASR) need to be divided into smaller meaningful units, in a task we refer to as ‘speech-unit (SU) delimitation’. We closely recreate the automatic delimitation system described by Lee and Glass (2012), ‘Sentence detection using multiple annotations’, Proceedings of INTERSPEECH, which combines a prosodic model, language model and speech-unit length model in log-linear fashion. Since state-of-the-art natural language processing (NLP) tools have been developed to deal with written text and its characteristic sentence-like units, SU delimitation helps bridge the gap between ASR and NLP, by normalising spoken data into a more canonical format. Previous work has focused on native speaker recordings; we test the system of Lee and Glass (2012) on non-native speaker (or ‘learner’) data, achieving performance above the state-of-the-art. We also consider alternative evaluation metrics which move away from the idea of a single ‘truth’ in SU delimitation, and frame this work in the context of downstream NLP applications.</abstract>
    <identifier type="citekey">moore-etal-2016-automated</identifier>
    <location>
        <url>https://aclanthology.org/C16-1075/</url>
    </location>
    <part>
        <date>2016-12</date>
        <extent unit="page">
            <start>782</start>
            <end>793</end>
        </extent>
    </part>
</mods>
</modsCollection>
%0 Conference Proceedings
%T Automated speech-unit delimitation in spoken learner English
%A Moore, Russell
%A Caines, Andrew
%A Graham, Calbert
%A Buttery, Paula
%Y Matsumoto, Yuji
%Y Prasad, Rashmi
%S Proceedings of COLING 2016, the 26th International Conference on Computational Linguistics: Technical Papers
%D 2016
%8 December
%I The COLING 2016 Organizing Committee
%C Osaka, Japan
%F moore-etal-2016-automated
%X In order to apply computational linguistic analyses and pass information to downstream applications, transcriptions of speech obtained via automatic speech recognition (ASR) need to be divided into smaller meaningful units, in a task we refer to as ‘speech-unit (SU) delimitation’. We closely recreate the automatic delimitation system described by Lee and Glass (2012), ‘Sentence detection using multiple annotations’, Proceedings of INTERSPEECH, which combines a prosodic model, language model and speech-unit length model in log-linear fashion. Since state-of-the-art natural language processing (NLP) tools have been developed to deal with written text and its characteristic sentence-like units, SU delimitation helps bridge the gap between ASR and NLP, by normalising spoken data into a more canonical format. Previous work has focused on native speaker recordings; we test the system of Lee and Glass (2012) on non-native speaker (or ‘learner’) data, achieving performance above the state-of-the-art. We also consider alternative evaluation metrics which move away from the idea of a single ‘truth’ in SU delimitation, and frame this work in the context of downstream NLP applications.
%U https://aclanthology.org/C16-1075/
%P 782-793
Markdown (Informal)
[Automated speech-unit delimitation in spoken learner English](https://aclanthology.org/C16-1075/) (Moore et al., COLING 2016)
ACL
- Russell Moore, Andrew Caines, Calbert Graham, and Paula Buttery. 2016. Automated speech-unit delimitation in spoken learner English. In Proceedings of COLING 2016, the 26th International Conference on Computational Linguistics: Technical Papers, pages 782–793, Osaka, Japan. The COLING 2016 Organizing Committee.