@article{rozovskaya-etal-2017-adapting,
    title = "Adapting to Learner Errors with Minimal Supervision",
    author = "Rozovskaya, Alla and
      Roth, Dan and
      Sammons, Mark",
    journal = "Computational Linguistics",
    volume = "43",
    number = "4",
    month = dec,
    year = "2017",
    address = "Cambridge, MA",
    publisher = "MIT Press",
    url = "https://aclanthology.org/J17-4002",
    doi = "10.1162/COLI_a_00299",
    pages = "723--760",
abstract = "This article considers the problem of correcting errors made by English as a Second Language writers from a machine learning perspective, and addresses an important issue of developing an appropriate training paradigm for the task, one that accounts for error patterns of non-native writers using minimal supervision. Existing training approaches present a trade-off between large amounts of cheap data offered by the native-trained models and additional knowledge of learner error patterns provided by the more expensive method of training on annotated learner data. We propose a novel training approach that draws on the strengths offered by the two standard training paradigms{---}of training either on native or on annotated learner data{---}and that outperforms both of these standard methods. Using the key observation that parameters relating to error regularities exhibited by non-native writers are relatively simple, we develop models that can incorporate knowledge about error regularities based on a small annotated sample but that are otherwise trained on native English data. The key contribution of this article is the introduction and analysis of two methods for adapting the learned models to error patterns of non-native writers; one method that applies to generative classifiers and a second that applies to discriminative classifiers. Both methods demonstrated state-of-the-art performance in several text correction competitions. In particular, the Illinois system that implements these methods ranked at the top in two recent CoNLL shared tasks on error correction.1 We conduct further evaluation of the proposed approaches studying the effect of using error data from speakers of the same native language, languages that are closely related linguistically, and unrelated languages.",
}
<?xml version="1.0" encoding="UTF-8"?>
<modsCollection xmlns="http://www.loc.gov/mods/v3">
<mods ID="rozovskaya-etal-2017-adapting">
  <titleInfo>
    <title>Adapting to Learner Errors with Minimal Supervision</title>
  </titleInfo>
  <name type="personal">
    <namePart type="given">Alla</namePart>
    <namePart type="family">Rozovskaya</namePart>
    <role>
      <roleTerm authority="marcrelator" type="text">author</roleTerm>
    </role>
  </name>
  <name type="personal">
    <namePart type="given">Dan</namePart>
    <namePart type="family">Roth</namePart>
    <role>
      <roleTerm authority="marcrelator" type="text">author</roleTerm>
    </role>
  </name>
  <name type="personal">
    <namePart type="given">Mark</namePart>
    <namePart type="family">Sammons</namePart>
    <role>
      <roleTerm authority="marcrelator" type="text">author</roleTerm>
    </role>
  </name>
  <originInfo>
    <dateIssued>2017-12</dateIssued>
  </originInfo>
  <typeOfResource>text</typeOfResource>
  <genre authority="bibutilsgt">journal article</genre>
  <relatedItem type="host">
    <titleInfo>
      <title>Computational Linguistics</title>
    </titleInfo>
    <originInfo>
      <issuance>continuing</issuance>
      <publisher>MIT Press</publisher>
      <place>
        <placeTerm type="text">Cambridge, MA</placeTerm>
      </place>
    </originInfo>
    <genre authority="marcgt">periodical</genre>
    <genre authority="bibutilsgt">academic journal</genre>
  </relatedItem>
  <abstract>This article considers the problem of correcting errors made by English as a Second Language writers from a machine learning perspective, and addresses an important issue of developing an appropriate training paradigm for the task, one that accounts for error patterns of non-native writers using minimal supervision. Existing training approaches present a trade-off between large amounts of cheap data offered by the native-trained models and additional knowledge of learner error patterns provided by the more expensive method of training on annotated learner data. We propose a novel training approach that draws on the strengths offered by the two standard training paradigms—of training either on native or on annotated learner data—and that outperforms both of these standard methods. Using the key observation that parameters relating to error regularities exhibited by non-native writers are relatively simple, we develop models that can incorporate knowledge about error regularities based on a small annotated sample but that are otherwise trained on native English data. The key contribution of this article is the introduction and analysis of two methods for adapting the learned models to error patterns of non-native writers; one method that applies to generative classifiers and a second that applies to discriminative classifiers. Both methods demonstrated state-of-the-art performance in several text correction competitions. In particular, the Illinois system that implements these methods ranked at the top in two recent CoNLL shared tasks on error correction. We conduct further evaluation of the proposed approaches, studying the effect of using error data from speakers of the same native language, languages that are closely related linguistically, and unrelated languages.</abstract>
<identifier type="citekey">rozovskaya-etal-2017-adapting</identifier>
<identifier type="doi">10.1162/COLI_a_00299</identifier>
<location>
<url>https://aclanthology.org/J17-4002</url>
</location>
<part>
<date>2017-12</date>
<detail type="volume"><number>43</number></detail>
<detail type="issue"><number>4</number></detail>
<extent unit="page">
<start>723</start>
<end>760</end>
</extent>
</part>
</mods>
</modsCollection>
%0 Journal Article
%T Adapting to Learner Errors with Minimal Supervision
%A Rozovskaya, Alla
%A Roth, Dan
%A Sammons, Mark
%J Computational Linguistics
%D 2017
%8 December
%V 43
%N 4
%I MIT Press
%C Cambridge, MA
%F rozovskaya-etal-2017-adapting
%X This article considers the problem of correcting errors made by English as a Second Language writers from a machine learning perspective, and addresses an important issue of developing an appropriate training paradigm for the task, one that accounts for error patterns of non-native writers using minimal supervision. Existing training approaches present a trade-off between large amounts of cheap data offered by the native-trained models and additional knowledge of learner error patterns provided by the more expensive method of training on annotated learner data. We propose a novel training approach that draws on the strengths offered by the two standard training paradigms—of training either on native or on annotated learner data—and that outperforms both of these standard methods. Using the key observation that parameters relating to error regularities exhibited by non-native writers are relatively simple, we develop models that can incorporate knowledge about error regularities based on a small annotated sample but that are otherwise trained on native English data. The key contribution of this article is the introduction and analysis of two methods for adapting the learned models to error patterns of non-native writers; one method that applies to generative classifiers and a second that applies to discriminative classifiers. Both methods demonstrated state-of-the-art performance in several text correction competitions. In particular, the Illinois system that implements these methods ranked at the top in two recent CoNLL shared tasks on error correction. We conduct further evaluation of the proposed approaches, studying the effect of using error data from speakers of the same native language, languages that are closely related linguistically, and unrelated languages.
%R 10.1162/COLI_a_00299
%U https://aclanthology.org/J17-4002
%U https://doi.org/10.1162/COLI_a_00299
%P 723-760
Markdown (Informal)
[Adapting to Learner Errors with Minimal Supervision](https://aclanthology.org/J17-4002) (Rozovskaya et al., CL 2017)
ACL
Alla Rozovskaya, Dan Roth, and Mark Sammons. 2017. Adapting to Learner Errors with Minimal Supervision. Computational Linguistics, 43(4):723–760.