Courtney Napoles


2020

Rhetoric, Logic, and Dialectic: Advancing Theory-based Argument Quality Assessment in Natural Language Processing
Anne Lauscher | Lily Ng | Courtney Napoles | Joel Tetreault
Proceedings of the 28th International Conference on Computational Linguistics

Though preceding work in computational argument quality (AQ) mostly focuses on assessing overall AQ, researchers agree that writers would benefit from feedback targeting individual dimensions of argumentation theory. However, a large-scale theory-based corpus and corresponding computational models are missing. We fill this gap by conducting an extensive analysis covering three diverse domains of online argumentative writing and presenting GAQCorpus: the first large-scale English multi-domain (community Q&A forums, debate forums, review forums) corpus annotated with theory-based AQ scores. We then propose the first computational approaches to theory-based assessment, which can serve as strong baselines for future work. We demonstrate the feasibility of large-scale AQ annotation, show that exploiting relations between dimensions yields performance improvements, and explore the synergies between theory-based prediction and practical AQ assessment.

Proceedings of the 28th International Conference on Computational Linguistics: Industry Track
Ann Clifton | Courtney Napoles
Proceedings of the 28th International Conference on Computational Linguistics: Industry Track

Creating a Domain-diverse Corpus for Theory-based Argument Quality Assessment
Lily Ng | Anne Lauscher | Joel Tetreault | Courtney Napoles
Proceedings of the 7th Workshop on Argument Mining

Computational models of argument quality (AQ) have focused primarily on assessing the overall quality or just one specific characteristic of an argument, such as its convincingness or its clarity. However, previous work has claimed that assessment based on theoretical dimensions of argumentation could benefit writers, but developing such models has been limited by the lack of annotated data. In this work, we describe GAQCorpus, the first large, domain-diverse annotated corpus of theory-based AQ. We discuss how we designed the annotation task to reliably collect a large number of judgments with crowdsourcing, formulating theory-based guidelines that helped make subjective judgments of AQ more objective. We demonstrate how to identify arguments and adapt the annotation task for three diverse domains. Our work will inform research on theory-based argumentation annotation and enable the creation of more diverse corpora to support computational AQ assessment.

2019

Enabling Robust Grammatical Error Correction in New Domains: Data Sets, Metrics, and Analyses
Courtney Napoles | Maria Nădejde | Joel Tetreault
Transactions of the Association for Computational Linguistics, Volume 7

Until now, grammatical error correction (GEC) has been primarily evaluated on text written by non-native English speakers, with a focus on student essays. This paper enables GEC development on text written by native speakers by providing a new data set and metric. We present a multiple-reference test corpus for GEC that includes 4,000 sentences in two new domains (formal and informal writing by native English speakers) and 2,000 sentences from a diverse set of non-native student writing. We also collect human judgments of several GEC systems on this new test set and perform a meta-evaluation, assessing how reliable automatic metrics are across these domains. We find that commonly used GEC metrics have inconsistent performance across domains, and therefore we propose a new ensemble metric that is robust on all three domains of text.

2018

How do you correct run-on sentences it’s not as easy as it seems
Junchao Zheng | Courtney Napoles | Joel Tetreault | Kostiantyn Omelianchuk
Proceedings of the 2018 EMNLP Workshop W-NUT: The 4th Workshop on Noisy User-generated Text

Run-on sentences are common grammatical mistakes but little research has tackled this problem to date. This work introduces two machine learning models to correct run-on sentences that outperform leading methods for related tasks, punctuation restoration and whole-sentence grammatical error correction. Due to the limited annotated data for this error, we experiment with artificially generating training data from clean newswire text. Our findings suggest artificial training data is viable for this task. We discuss implications for correcting run-ons and other types of mistakes that have low coverage in error-annotated corpora.

2017

JFLEG: A Fluency Corpus and Benchmark for Grammatical Error Correction
Courtney Napoles | Keisuke Sakaguchi | Joel Tetreault
Proceedings of the 15th Conference of the European Chapter of the Association for Computational Linguistics: Volume 2, Short Papers

We present a new parallel corpus, JHU FLuency-Extended GUG corpus (JFLEG) for developing and evaluating grammatical error correction (GEC). Unlike other corpora, it represents a broad range of language proficiency levels and uses holistic fluency edits to not only correct grammatical errors but also make the original text more native sounding. We describe the types of corrections made and benchmark four leading GEC systems on this corpus, identifying specific areas in which they do well and how they can improve. JFLEG fulfills the need for a new gold standard to properly assess the current state of GEC.

Finding Good Conversations Online: The Yahoo News Annotated Comments Corpus
Courtney Napoles | Joel Tetreault | Aasish Pappu | Enrica Rosato | Brian Provenzale
Proceedings of the 11th Linguistic Annotation Workshop

This work presents a dataset and annotation scheme for the new task of identifying “good” conversations that occur online, which we call ERICs: Engaging, Respectful, and/or Informative Conversations. We develop a taxonomy to reflect features of entire threads and individual comments which we believe contribute to identifying ERICs; code a novel dataset of Yahoo News comment threads (2.4k threads and 10k comments) and 1k threads from the Internet Argument Corpus; and analyze the features characteristic of ERICs. This is one of the largest annotated corpora of online human dialogues, with the most detailed set of annotations. It will be valuable for identifying ERICs and other aspects of argumentation, dialogue, and discourse.

GEC into the future: Where are we going and how do we get there?
Keisuke Sakaguchi | Courtney Napoles | Joel Tetreault
Proceedings of the 12th Workshop on Innovative Use of NLP for Building Educational Applications

The field of grammatical error correction (GEC) has made tremendous bounds in the last ten years, but new questions and obstacles are revealing themselves. In this position paper, we discuss the issues that need to be addressed and provide recommendations for the field to continue to make progress, and propose a new shared task. We invite suggestions and critiques from the audience to make the new shared task a community-driven venture.

Systematically Adapting Machine Translation for Grammatical Error Correction
Courtney Napoles | Chris Callison-Burch
Proceedings of the 12th Workshop on Innovative Use of NLP for Building Educational Applications

In this work, we adapt machine translation (MT) to grammatical error correction, identifying how components of the statistical MT pipeline can be modified for this task and analyzing how each modification impacts system performance. We evaluate the contribution of each of these components with standard evaluation metrics and automatically characterize the morphological and lexical transformations made in system output. Our model rivals the current state of the art using a fraction of the training data.

2016

There’s No Comparison: Reference-less Evaluation Metrics in Grammatical Error Correction
Courtney Napoles | Keisuke Sakaguchi | Joel Tetreault
Proceedings of the 2016 Conference on Empirical Methods in Natural Language Processing

Reassessing the Goals of Grammatical Error Correction: Fluency Instead of Grammaticality
Keisuke Sakaguchi | Courtney Napoles | Matt Post | Joel Tetreault
Transactions of the Association for Computational Linguistics, Volume 4

The field of grammatical error correction (GEC) has grown substantially in recent years, with research directed at both evaluation metrics and improved system performance against those metrics. One unvisited assumption, however, is the reliance of GEC evaluation on error-coded corpora, which contain specific labeled corrections. We examine current practices and show that GEC’s reliance on such corpora unnaturally constrains annotation and automatic evaluation, resulting in (a) sentences that do not sound acceptable to native speakers and (b) system rankings that do not correlate with human judgments. In light of this, we propose an alternate approach that jettisons costly error coding in favor of unannotated, whole-sentence rewrites. We compare the performance of existing metrics over different gold-standard annotations, and show that automatic evaluation with our new annotation scheme has very strong correlation with expert rankings (ρ = 0.82). As a result, we advocate for a fundamental and necessary shift in the goal of GEC, from correcting small, labeled error types, to producing text that has native fluency.

Optimizing Statistical Machine Translation for Text Simplification
Wei Xu | Courtney Napoles | Ellie Pavlick | Quanze Chen | Chris Callison-Burch
Transactions of the Association for Computational Linguistics, Volume 4

Most recent sentence simplification systems use basic machine translation models to learn lexical and syntactic paraphrases from a manually simplified parallel corpus. These methods are limited by the quality and quantity of manually simplified corpora, which are expensive to build. In this paper, we conduct an in-depth adaptation of statistical machine translation to perform text simplification, taking advantage of large-scale paraphrases learned from bilingual texts and a small amount of manual simplifications with multiple references. Our work is the first to design automatic metrics that are effective for tuning and evaluating simplification systems, which will facilitate iterative development for this task.

Sentential Paraphrasing as Black-Box Machine Translation
Courtney Napoles | Chris Callison-Burch | Matt Post
Proceedings of the 2016 Conference of the North American Chapter of the Association for Computational Linguistics: Demonstrations

The Effect of Multiple Grammatical Errors on Processing Non-Native Writing
Courtney Napoles | Aoife Cahill | Nitin Madnani
Proceedings of the 11th Workshop on Innovative Use of NLP for Building Educational Applications

A Report on the Automatic Evaluation of Scientific Writing Shared Task
Vidas Daudaravicius | Rafael E. Banchs | Elena Volodina | Courtney Napoles
Proceedings of the 11th Workshop on Innovative Use of NLP for Building Educational Applications

2015

Automatically Scoring Freshman Writing: A Preliminary Investigation
Courtney Napoles | Chris Callison-Burch
Proceedings of the Tenth Workshop on Innovative Use of NLP for Building Educational Applications

Ground Truth for Grammatical Error Correction Metrics
Courtney Napoles | Keisuke Sakaguchi | Matt Post | Joel Tetreault
Proceedings of the 53rd Annual Meeting of the Association for Computational Linguistics and the 7th International Joint Conference on Natural Language Processing (Volume 2: Short Papers)

Problems in Current Text Simplification Research: New Data Can Help
Wei Xu | Chris Callison-Burch | Courtney Napoles
Transactions of the Association for Computational Linguistics, Volume 3

Simple Wikipedia has dominated simplification research in the past 5 years. In this opinion paper, we argue that focusing on Wikipedia limits simplification research. We back up our arguments with corpus analysis and by highlighting statements that other researchers have made in the simplification literature. We introduce a new simplification dataset that is a significant improvement over Simple Wikipedia, and present a novel quantitative-comparative approach to study the quality of simplification data resources.

2012

Annotated Gigaword
Courtney Napoles | Matthew Gormley | Benjamin Van Durme
Proceedings of the Joint Workshop on Automatic Knowledge Base Construction and Web-scale Knowledge Extraction (AKBC-WEKEX)

2011

Paraphrastic Sentence Compression with a Character-based Metric: Tightening without Deletion
Courtney Napoles | Chris Callison-Burch | Juri Ganitkevitch | Benjamin Van Durme
Proceedings of the Workshop on Monolingual Text-To-Text Generation

Evaluating Sentence Compression: Pitfalls and Suggested Remedies
Courtney Napoles | Benjamin Van Durme | Chris Callison-Burch
Proceedings of the Workshop on Monolingual Text-To-Text Generation

Learning Sentential Paraphrases from Bilingual Parallel Corpora for Text-to-Text Generation
Juri Ganitkevitch | Chris Callison-Burch | Courtney Napoles | Benjamin Van Durme
Proceedings of the 2011 Conference on Empirical Methods in Natural Language Processing

2010

Learning Simple Wikipedia: A Cogitation in Ascertaining Abecedarian Language
Courtney Napoles | Mark Dredze
Proceedings of the NAACL HLT 2010 Workshop on Computational Linguistics and Writing: Writing Processes and Authoring Aids