@inproceedings{hedderich-klakow-2018-training,
    title = "Training a Neural Network in a Low-Resource Setting on Automatically Annotated Noisy Data",
    author = "Hedderich, Michael A.  and
      Klakow, Dietrich",
    editor = "Haffari, Reza  and
      Cherry, Colin  and
      Foster, George  and
      Khadivi, Shahram  and
      Salehi, Bahar",
    booktitle = "Proceedings of the Workshop on Deep Learning Approaches for Low-Resource {NLP}",
    month = jul,
    year = "2018",
    address = "Melbourne",
    publisher = "Association for Computational Linguistics",
    url = "https://aclanthology.org/W18-3402/",
    doi = "10.18653/v1/W18-3402",
    pages = "12--18",
    abstract = "Manually labeled corpora are expensive to create and often not available for low-resource languages or domains. Automatic labeling approaches are an alternative way to obtain labeled data in a quicker and cheaper way. However, these labels often contain more errors which can deteriorate a classifier{'}s performance when trained on this data. We propose a noise layer that is added to a neural network architecture. This allows modeling the noise and train on a combination of clean and noisy data. We show that in a low-resource NER task we can improve performance by up to 35{\%} by using additional, noisy data and handling the noise."
}
<?xml version="1.0" encoding="UTF-8"?>
<modsCollection xmlns="http://www.loc.gov/mods/v3">
<mods ID="hedderich-klakow-2018-training">
    <titleInfo>
        <title>Training a Neural Network in a Low-Resource Setting on Automatically Annotated Noisy Data</title>
    </titleInfo>
    <name type="personal">
        <namePart type="given">Michael</namePart>
        <namePart type="given">A</namePart>
        <namePart type="family">Hedderich</namePart>
        <role>
            <roleTerm authority="marcrelator" type="text">author</roleTerm>
        </role>
    </name>
    <name type="personal">
        <namePart type="given">Dietrich</namePart>
        <namePart type="family">Klakow</namePart>
        <role>
            <roleTerm authority="marcrelator" type="text">author</roleTerm>
        </role>
    </name>
    <originInfo>
        <dateIssued>2018-07</dateIssued>
    </originInfo>
    <typeOfResource>text</typeOfResource>
    <relatedItem type="host">
        <titleInfo>
            <title>Proceedings of the Workshop on Deep Learning Approaches for Low-Resource NLP</title>
        </titleInfo>
        <name type="personal">
            <namePart type="given">Reza</namePart>
            <namePart type="family">Haffari</namePart>
            <role>
                <roleTerm authority="marcrelator" type="text">editor</roleTerm>
            </role>
        </name>
        <name type="personal">
            <namePart type="given">Colin</namePart>
            <namePart type="family">Cherry</namePart>
            <role>
                <roleTerm authority="marcrelator" type="text">editor</roleTerm>
            </role>
        </name>
        <name type="personal">
            <namePart type="given">George</namePart>
            <namePart type="family">Foster</namePart>
            <role>
                <roleTerm authority="marcrelator" type="text">editor</roleTerm>
            </role>
        </name>
        <name type="personal">
            <namePart type="given">Shahram</namePart>
            <namePart type="family">Khadivi</namePart>
            <role>
                <roleTerm authority="marcrelator" type="text">editor</roleTerm>
            </role>
        </name>
        <name type="personal">
            <namePart type="given">Bahar</namePart>
            <namePart type="family">Salehi</namePart>
            <role>
                <roleTerm authority="marcrelator" type="text">editor</roleTerm>
            </role>
        </name>
        <originInfo>
            <publisher>Association for Computational Linguistics</publisher>
            <place>
                <placeTerm type="text">Melbourne</placeTerm>
            </place>
        </originInfo>
        <genre authority="marcgt">conference publication</genre>
    </relatedItem>
    <abstract>Manually labeled corpora are expensive to create and often not available for low-resource languages or domains. Automatic labeling approaches are an alternative way to obtain labeled data in a quicker and cheaper way. However, these labels often contain more errors which can deteriorate a classifier’s performance when trained on this data. We propose a noise layer that is added to a neural network architecture. This allows modeling the noise and train on a combination of clean and noisy data. We show that in a low-resource NER task we can improve performance by up to 35% by using additional, noisy data and handling the noise.</abstract>
    <identifier type="citekey">hedderich-klakow-2018-training</identifier>
    <identifier type="doi">10.18653/v1/W18-3402</identifier>
    <location>
        <url>https://aclanthology.org/W18-3402/</url>
    </location>
    <part>
        <date>2018-07</date>
        <extent unit="page">
            <start>12</start>
            <end>18</end>
        </extent>
    </part>
</mods>
</modsCollection>
%0 Conference Proceedings
%T Training a Neural Network in a Low-Resource Setting on Automatically Annotated Noisy Data
%A Hedderich, Michael A.
%A Klakow, Dietrich
%Y Haffari, Reza
%Y Cherry, Colin
%Y Foster, George
%Y Khadivi, Shahram
%Y Salehi, Bahar
%S Proceedings of the Workshop on Deep Learning Approaches for Low-Resource NLP
%D 2018
%8 July
%I Association for Computational Linguistics
%C Melbourne
%F hedderich-klakow-2018-training
%X Manually labeled corpora are expensive to create and often not available for low-resource languages or domains. Automatic labeling approaches are an alternative way to obtain labeled data in a quicker and cheaper way. However, these labels often contain more errors which can deteriorate a classifier’s performance when trained on this data. We propose a noise layer that is added to a neural network architecture. This allows modeling the noise and train on a combination of clean and noisy data. We show that in a low-resource NER task we can improve performance by up to 35% by using additional, noisy data and handling the noise.
%R 10.18653/v1/W18-3402
%U https://aclanthology.org/W18-3402/
%U https://doi.org/10.18653/v1/W18-3402
%P 12-18
Markdown (Informal)
[Training a Neural Network in a Low-Resource Setting on Automatically Annotated Noisy Data](https://aclanthology.org/W18-3402/) (Hedderich & Klakow, ACL 2018)