Deep Learning for Punctuation Restoration in Medical Reports

Wael Salloum; Gregory Finley; Erik Edwards; Mark Miller; David Suendermann-Oeft

doi:10.18653/v1/W17-2319

Abstract

In clinical dictation, speakers try to be as concise as possible to save time, often resulting in utterances without explicit punctuation commands. Since the end product of a dictated report, e.g. an out-patient letter, does require correct orthography, including exact punctuation, the latter need to be restored, preferably by automated means. This paper describes a method for punctuation restoration based on a state-of-the-art stack of NLP and machine learning techniques including B-RNNs with an attention mechanism and late fusion, as well as a feature extraction technique tailored to the processing of medical terminology using a novel vocabulary reduction model. To the best of our knowledge, the resulting performance is superior to that reported in prior art on similar tasks.

Anthology ID: W17-2319
Volume: Proceedings of the 16th BioNLP Workshop
Month: August
Year: 2017
Address: Vancouver, Canada
Editors: Kevin Bretonnel Cohen, Dina Demner-Fushman, Sophia Ananiadou, Junichi Tsujii
Venue: BioNLP
SIG: SIGBIOMED
Publisher: Association for Computational Linguistics
Pages: 159–164
URL: https://aclanthology.org/W17-2319/
DOI: 10.18653/v1/W17-2319
Bibkey: salloum-etal-2017-deep
Cite (ACL): Wael Salloum, Greg Finley, Erik Edwards, Mark Miller, and David Suendermann-Oeft. 2017. Deep Learning for Punctuation Restoration in Medical Reports. In Proceedings of the 16th BioNLP Workshop, pages 159–164, Vancouver, Canada. Association for Computational Linguistics.
Cite (Informal): Deep Learning for Punctuation Restoration in Medical Reports (Salloum et al., BioNLP 2017)
PDF: https://aclanthology.org/W17-2319.pdf

BibTeX

@inproceedings{salloum-etal-2017-deep,
    title = "Deep Learning for Punctuation Restoration in Medical Reports",
    author = "Salloum, Wael  and
      Finley, Greg  and
      Edwards, Erik  and
      Miller, Mark  and
      Suendermann-Oeft, David",
    editor = "Cohen, Kevin Bretonnel  and
      Demner-Fushman, Dina  and
      Ananiadou, Sophia  and
      Tsujii, Junichi",
    booktitle = "Proceedings of the 16th {B}io{NLP} Workshop",
    month = aug,
    year = "2017",
    address = "Vancouver, Canada",
    publisher = "Association for Computational Linguistics",
    url = "https://aclanthology.org/W17-2319/",
    doi = "10.18653/v1/W17-2319",
    pages = "159--164",
    abstract = "In clinical dictation, speakers try to be as concise as possible to save time, often resulting in utterances without explicit punctuation commands. Since the end product of a dictated report, e.g. an out-patient letter, does require correct orthography, including exact punctuation, the latter need to be restored, preferably by automated means. This paper describes a method for punctuation restoration based on a state-of-the-art stack of NLP and machine learning techniques including B-RNNs with an attention mechanism and late fusion, as well as a feature extraction technique tailored to the processing of medical terminology using a novel vocabulary reduction model. To the best of our knowledge, the resulting performance is superior to that reported in prior art on similar tasks."
}

MODS XML

<?xml version="1.0" encoding="UTF-8"?>
<modsCollection xmlns="http://www.loc.gov/mods/v3">
<mods ID="salloum-etal-2017-deep">
    <titleInfo>
        <title>Deep Learning for Punctuation Restoration in Medical Reports</title>
    </titleInfo>
    <name type="personal">
        <namePart type="given">Wael</namePart>
        <namePart type="family">Salloum</namePart>
        <role>
            <roleTerm authority="marcrelator" type="text">author</roleTerm>
        </role>
    </name>
    <name type="personal">
        <namePart type="given">Greg</namePart>
        <namePart type="family">Finley</namePart>
        <role>
            <roleTerm authority="marcrelator" type="text">author</roleTerm>
        </role>
    </name>
    <name type="personal">
        <namePart type="given">Erik</namePart>
        <namePart type="family">Edwards</namePart>
        <role>
            <roleTerm authority="marcrelator" type="text">author</roleTerm>
        </role>
    </name>
    <name type="personal">
        <namePart type="given">Mark</namePart>
        <namePart type="family">Miller</namePart>
        <role>
            <roleTerm authority="marcrelator" type="text">author</roleTerm>
        </role>
    </name>
    <name type="personal">
        <namePart type="given">David</namePart>
        <namePart type="family">Suendermann-Oeft</namePart>
        <role>
            <roleTerm authority="marcrelator" type="text">author</roleTerm>
        </role>
    </name>
    <originInfo>
        <dateIssued>2017-08</dateIssued>
    </originInfo>
    <typeOfResource>text</typeOfResource>
    <relatedItem type="host">
        <titleInfo>
            <title>Proceedings of the 16th BioNLP Workshop</title>
        </titleInfo>
        <name type="personal">
            <namePart type="given">Kevin</namePart>
            <namePart type="given">Bretonnel</namePart>
            <namePart type="family">Cohen</namePart>
            <role>
                <roleTerm authority="marcrelator" type="text">editor</roleTerm>
            </role>
        </name>
        <name type="personal">
            <namePart type="given">Dina</namePart>
            <namePart type="family">Demner-Fushman</namePart>
            <role>
                <roleTerm authority="marcrelator" type="text">editor</roleTerm>
            </role>
        </name>
        <name type="personal">
            <namePart type="given">Sophia</namePart>
            <namePart type="family">Ananiadou</namePart>
            <role>
                <roleTerm authority="marcrelator" type="text">editor</roleTerm>
            </role>
        </name>
        <name type="personal">
            <namePart type="given">Junichi</namePart>
            <namePart type="family">Tsujii</namePart>
            <role>
                <roleTerm authority="marcrelator" type="text">editor</roleTerm>
            </role>
        </name>
        <originInfo>
            <publisher>Association for Computational Linguistics</publisher>
            <place>
                <placeTerm type="text">Vancouver, Canada</placeTerm>
            </place>
        </originInfo>
        <genre authority="marcgt">conference publication</genre>
    </relatedItem>
    <abstract>In clinical dictation, speakers try to be as concise as possible to save time, often resulting in utterances without explicit punctuation commands. Since the end product of a dictated report, e.g. an out-patient letter, does require correct orthography, including exact punctuation, the latter need to be restored, preferably by automated means. This paper describes a method for punctuation restoration based on a state-of-the-art stack of NLP and machine learning techniques including B-RNNs with an attention mechanism and late fusion, as well as a feature extraction technique tailored to the processing of medical terminology using a novel vocabulary reduction model. To the best of our knowledge, the resulting performance is superior to that reported in prior art on similar tasks.</abstract>
    <identifier type="citekey">salloum-etal-2017-deep</identifier>
    <identifier type="doi">10.18653/v1/W17-2319</identifier>
    <location>
        <url>https://aclanthology.org/W17-2319/</url>
    </location>
    <part>
        <date>2017-08</date>
        <extent unit="page">
            <start>159</start>
            <end>164</end>
        </extent>
    </part>
</mods>
</modsCollection>

Endnote

%0 Conference Proceedings
%T Deep Learning for Punctuation Restoration in Medical Reports
%A Salloum, Wael
%A Finley, Greg
%A Edwards, Erik
%A Miller, Mark
%A Suendermann-Oeft, David
%Y Cohen, Kevin Bretonnel
%Y Demner-Fushman, Dina
%Y Ananiadou, Sophia
%Y Tsujii, Junichi
%S Proceedings of the 16th BioNLP Workshop
%D 2017
%8 August
%I Association for Computational Linguistics
%C Vancouver, Canada
%F salloum-etal-2017-deep
%X In clinical dictation, speakers try to be as concise as possible to save time, often resulting in utterances without explicit punctuation commands. Since the end product of a dictated report, e.g. an out-patient letter, does require correct orthography, including exact punctuation, the latter need to be restored, preferably by automated means. This paper describes a method for punctuation restoration based on a state-of-the-art stack of NLP and machine learning techniques including B-RNNs with an attention mechanism and late fusion, as well as a feature extraction technique tailored to the processing of medical terminology using a novel vocabulary reduction model. To the best of our knowledge, the resulting performance is superior to that reported in prior art on similar tasks.
%R 10.18653/v1/W17-2319
%U https://aclanthology.org/W17-2319/
%U https://doi.org/10.18653/v1/W17-2319
%P 159-164

Markdown (Informal)

[Deep Learning for Punctuation Restoration in Medical Reports](https://aclanthology.org/W17-2319/) (Salloum et al., BioNLP 2017)
