Ani Nenkova


2021

From Toxicity in Online Comments to Incivility in American News: Proceed with Caution
Anushree Hede | Oshin Agarwal | Linda Lu | Diana C. Mutz | Ani Nenkova
Proceedings of the 16th Conference of the European Chapter of the Association for Computational Linguistics: Main Volume

The ability to quantify incivility online, in news and in congressional debates is of great interest to political scientists. Computational tools for detecting online incivility for English are now fairly accessible and potentially could be applied more broadly. We test the Jigsaw Perspective API for its ability to detect the degree of incivility on a corpus that we developed, consisting of manual annotations of civility in American news. We demonstrate that toxicity models, as exemplified by Perspective, are inadequate for the analysis of incivility in news. We carry out error analysis that points to the need to develop methods to remove spurious correlations between words often mentioned in the news, especially identity descriptors, and incivility. Without such improvements, applying Perspective or similar models to news is likely to lead to wrong conclusions that are not aligned with the human perception of incivility.

The Utility and Interplay of Gazetteers and Entity Segmentation for Named Entity Recognition in English
Oshin Agarwal | Ani Nenkova
Findings of the Association for Computational Linguistics: ACL-IJCNLP 2021

2020

Trialstreamer: Mapping and Browsing Medical Evidence in Real-Time
Benjamin Nye | Ani Nenkova | Iain Marshall | Byron C. Wallace
Proceedings of the 58th Annual Meeting of the Association for Computational Linguistics: System Demonstrations

We introduce Trialstreamer, a living database of clinical trial reports. Here we mainly describe the evidence extraction component; this extracts from biomedical abstracts key pieces of information that clinicians need when appraising the literature, and also the relations between these. Specifically, the system extracts descriptions of trial participants, the treatments compared in each arm (the interventions), and which outcomes were measured. The system then attempts to infer which interventions were reported to work best by determining their relationship with identified trial outcome measures. In addition to summarizing individual trials, these extracted data elements allow automatic synthesis of results across many trials on the same topic. We apply the system at scale to all reports of randomized controlled trials indexed in MEDLINE, powering the automatic generation of evidence maps, which provide a global view of the efficacy of different interventions combining data from all relevant clinical trials on a topic. We make all code and models freely available alongside a demonstration of the web interface.

2019

The Feasibility of Embedding Based Automatic Evaluation for Single Document Summarization
Simeng Sun | Ani Nenkova
Proceedings of the 2019 Conference on Empirical Methods in Natural Language Processing and the 9th International Joint Conference on Natural Language Processing (EMNLP-IJCNLP)

ROUGE is widely used to automatically evaluate summarization systems. However, ROUGE measures semantic overlap between a system summary and a human reference at the word-string level, much at odds with the contemporary treatment of semantic meaning. Here we present a suite of experiments on using distributed representations for evaluating summarizers, in both reference-based and reference-free settings. Our experimental results show that the max value over each dimension of the summary ELMo word embeddings is a good representation that results in high correlation with human ratings. Averaging the cosine similarity of all encoders we tested yields high correlation with manual scores in the reference-free setting. The distributed representations outperform ROUGE on recent corpora for abstractive news summarization but are less good on test data used in past evaluations.

Predicting Annotation Difficulty to Improve Task Routing and Model Performance for Biomedical Information Extraction
Yinfei Yang | Oshin Agarwal | Chris Tar | Byron C. Wallace | Ani Nenkova
Proceedings of the 2019 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies, Volume 1 (Long and Short Papers)

Modern NLP systems require high-quality annotated data. For specialized domains, expert annotations may be prohibitively expensive; the alternative is to rely on crowdsourcing to reduce costs at the risk of introducing noise. In this paper we demonstrate that directly modeling instance difficulty can be used to improve model performance and to route instances to appropriate annotators. Our difficulty prediction model combines two learned representations: a ‘universal’ encoder trained on out of domain data, and a task-specific encoder. Experiments on a complex biomedical information extraction task using expert and lay annotators show that: (i) simply excluding from the training data instances predicted to be difficult yields a small boost in performance; (ii) using difficulty scores to weight instances during training provides further, consistent gains; (iii) assigning instances predicted to be difficult to domain experts is an effective strategy for task routing. Further, our experiments confirm the expectation that for such domain-specific tasks expert annotations are of much higher quality and preferable to obtain if practical and that augmenting small amounts of expert data with a larger set of lay annotations leads to further improvements in model performance.

Emotion Impacts Speech Recognition Performance
Rushab Munot | Ani Nenkova
Proceedings of the 2019 Conference of the North American Chapter of the Association for Computational Linguistics: Student Research Workshop

It has been established that the performance of speech recognition systems depends on multiple factors including the lexical content, speaker identity and dialect. Here we use three English datasets of acted emotion to demonstrate that emotional content also impacts the performance of commercial systems. On two of the corpora, emotion is a bigger contributor to recognition errors than speaker identity and on two, neutral speech is recognized considerably better than emotional speech. We further evaluate the commercial systems on spontaneous interactions that contain portions of emotional speech. We propose, and validate on the acted datasets, a method that allows us to evaluate the overall impact of emotion on recognition even when manual transcripts are not available. Using this method, we show that emotion in natural spontaneous dialogue is a less prominent but still significant factor in recognition accuracy.

How to Compare Summarizers without Target Length? Pitfalls, Solutions and Re-Examination of the Neural Summarization Literature
Simeng Sun | Ori Shapira | Ido Dagan | Ani Nenkova
Proceedings of the Workshop on Methods for Optimizing and Evaluating Neural Language Generation

We show that plain ROUGE F1 scores are not ideal for comparing current neural systems which on average produce different lengths. This is due to a non-linear pattern between ROUGE F1 and summary length. To alleviate the effect of length during evaluation, we have proposed a new method which normalizes the ROUGE F1 scores of a system by that of a random system with the same average output length. A pilot human evaluation has shown that humans prefer short summaries in terms of the verbosity of a summary but overall consider longer summaries to be of higher quality. While human evaluations are more expensive in time and resources, it is clear that normalization, such as the one we proposed for automatic evaluation, will make human evaluations more meaningful.

Browsing Health: Information Extraction to Support New Interfaces for Accessing Medical Evidence
Soham Parikh | Elizabeth Conrad | Oshin Agarwal | Iain Marshall | Byron Wallace | Ani Nenkova
Proceedings of the Workshop on Extracting Structured Knowledge from Scientific Publications

Standard paradigms for search do not work well in the medical context. Typical information needs, such as retrieving a full list of medical interventions for a given condition, or finding the reported efficacy of a particular treatment with respect to a specific outcome of interest, cannot be straightforwardly posed in typical text-box search. Instead, we propose faceted search in which a user specifies a condition and then can browse treatments and outcomes that have been evaluated. Choosing from these, they can access randomized controlled trials (RCTs) describing individual studies. Realizing such a view of the medical evidence requires information extraction techniques to identify the population, interventions, and outcome measures in an RCT. Patients, health practitioners, and biomedical librarians all stand to benefit from such innovation in search of medical evidence. We present an initial prototype of such an interface applied to pre-registered clinical studies. We also discuss pilot studies into the applicability of information extraction methods to allow for similar access to all published trial results.

Evaluation of named entity coreference
Oshin Agarwal | Sanjay Subramanian | Ani Nenkova | Dan Roth
Proceedings of the Second Workshop on Computational Models of Reference, Anaphora and Coreference

In many NLP applications like search and information extraction for named entities, it is necessary to find all the mentions of a named entity, some of which appear as pronouns (she, his, etc.) or nominals (the professor, the German chancellor, etc.). It is therefore important that coreference resolution systems are able to link these different types of mentions to the correct entity name. We evaluate state-of-the-art coreference resolution systems for the task of resolving all mentions to named entities. Our analysis reveals that standard coreference metrics do not adequately reflect the requirements in this task: they do not penalize systems for not identifying any mentions by name to an entity and they reward systems even if they correctly find mentions of the same entity but fail to link these to a proper name (she–the student–no name). We introduce new metrics for evaluating named entity coreference that address these discrepancies and show that for the comparisons of competitive systems, standard coreference evaluations could give misleading results for this task. We are, however, able to confirm that the state-of-the-art system according to traditional evaluations also performs vastly better than other systems on the named entity coreference task.

Word Embeddings (Also) Encode Human Personality Stereotypes
Oshin Agarwal | Funda Durupınar | Norman I. Badler | Ani Nenkova
Proceedings of the Eighth Joint Conference on Lexical and Computational Semantics (*SEM 2019)

Word representations trained on text reproduce human implicit bias related to gender, race and age. Methods have been developed to remove such bias. Here, we present results that show that human stereotypes exist even for much more nuanced judgments such as personality, for a variety of person identities beyond the typically legally protected attributes, and that these are similarly captured in word representations. Specifically, we collected human judgments about a person’s Big Five personality traits formed solely from information about the occupation, nationality or a common noun description of a hypothetical person. Analysis of the data reveals a large number of statistically significant stereotypes in people. We then demonstrate that the bias captured in lexical representations is statistically significantly correlated with the documented human bias. Our results, showing bias for a large set of person descriptors for such nuanced traits, put in doubt the feasibility of broadly and fairly applying debiasing methods and call for the development of new methods for auditing language technology systems and resources.

2018

Syntactic Patterns Improve Information Extraction for Medical Search
Roma Patel | Yinfei Yang | Iain Marshall | Ani Nenkova | Byron Wallace
Proceedings of the 2018 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies, Volume 2 (Short Papers)

Medical professionals search the published literature by specifying the type of patients, the medical intervention(s) and the outcome measure(s) of interest. In this paper we demonstrate how features encoding syntactic patterns improve the performance of state-of-the-art sequence tagging models (both neural and linear) for information extraction of these medically relevant categories. We present an analysis of the type of patterns exploited and of the semantic space induced for these, i.e., the distributed representations learned for identified multi-token patterns. We show that these learned representations differ substantially from those of the constituent unigrams, suggesting that the patterns capture contextual information that is otherwise lost.

A Corpus with Multi-Level Annotations of Patients, Interventions and Outcomes to Support Language Processing for Medical Literature
Benjamin Nye | Junyi Jessy Li | Roma Patel | Yinfei Yang | Iain Marshall | Ani Nenkova | Byron Wallace
Proceedings of the 56th Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers)

We present a corpus of 5,000 richly annotated abstracts of medical articles describing clinical randomized controlled trials. Annotations include demarcations of text spans that describe the Patient population enrolled, the Interventions studied and to what they were Compared, and the Outcomes measured (the ‘PICO’ elements). These spans are further annotated at a more granular level, e.g., individual interventions within them are marked and mapped onto a structured medical vocabulary. We acquired annotations from a diverse set of workers with varying levels of expertise and cost. We describe our data collection process and the corpus itself in detail. We then outline a set of challenging NLP tasks that would aid searching of the medical literature and the practice of evidence-based medicine.

Evaluating Multiple System Summary Lengths: A Case Study
Ori Shapira | David Gabay | Hadar Ronen | Judit Bar-Ilan | Yael Amsterdamer | Ani Nenkova | Ido Dagan
Proceedings of the 2018 Conference on Empirical Methods in Natural Language Processing

Practical summarization systems are expected to produce summaries of varying lengths, per user needs. While a couple of early summarization benchmarks tested systems across multiple summary lengths, this practice was mostly abandoned due to the assumed cost of producing reference summaries of multiple lengths. In this paper, we raise the research question of whether reference summaries of a single length can be used to reliably evaluate system summaries of multiple lengths. For that, we have analyzed a couple of datasets as a case study, using several variants of the ROUGE metric that are standard in summarization evaluation. Our findings indicate that the evaluation protocol in question is indeed competitive. This result paves the way to practically evaluating varying-length summaries with simple, possibly existing, summarization benchmarks.

2017

Aggregating and Predicting Sequence Labels from Crowd Annotations
An Thanh Nguyen | Byron Wallace | Junyi Jessy Li | Ani Nenkova | Matthew Lease
Proceedings of the 55th Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers)

Despite sequences being core to NLP, scant work has considered how to handle noisy sequence labels from multiple annotators for the same text. Given such annotations, we consider two complementary tasks: (1) aggregating sequential crowd labels to infer a best single set of consensus annotations; and (2) using crowd annotations as training data for a model that can predict sequences in unannotated text. For aggregation, we propose a novel Hidden Markov Model variant. To predict sequences in unannotated text, we propose a neural approach using Long Short Term Memory. We evaluate a suite of methods across two different applications and text genres: Named-Entity Recognition in news articles and Information Extraction from biomedical abstracts. Results show improvement over strong baselines. Our source code and data are available online.

Detecting (Un)Important Content for Single-Document News Summarization
Yinfei Yang | Forrest Bao | Ani Nenkova
Proceedings of the 15th Conference of the European Chapter of the Association for Computational Linguistics: Volume 2, Short Papers

We present a robust approach for detecting intrinsic sentence importance in news, by training on two corpora of document-summary pairs. When used for single-document summarization, our approach, combined with the “beginning of document” heuristic, outperforms a state-of-the-art summarizer and the beginning-of-article baseline in both automatic and manual evaluations. These results represent an important advance because in the absence of cross-document repetition, single document summarizers for news have not been able to consistently outperform the strong beginning-of-article baseline.

2016

Improving the Annotation of Sentence Specificity
Junyi Jessy Li | Bridget O’Daniel | Yi Wu | Wenli Zhao | Ani Nenkova
Proceedings of the Tenth International Conference on Language Resources and Evaluation (LREC'16)

We introduce improved guidelines for annotation of sentence specificity, addressing the issues encountered in prior work. Our annotation provides judgements of sentences in context. Rather than binary judgements, we introduce a specificity scale which accommodates nuanced judgements. Our augmented annotation procedure also allows us to define where in the discourse context the lack of specificity can be resolved. In addition, the cause of the underspecification is annotated in the form of free text questions. We present results from a pilot annotation with this new scheme and demonstrate good inter-annotator agreement. We find that the lack of specificity distributes evenly among immediate prior context, long distance prior context and no prior context. We also find that missing details that are not resolved in the prior context are more likely to trigger questions about the reason behind events, “why” and “how”. Our data is accessible at http://www.cis.upenn.edu/~nlp/corpora/lrec16spec.html

Phrase Generalization: a Corpus Study in Multi-Document Abstracts and Original News Alignments
Ariani Di-Felippo | Ani Nenkova
Proceedings of the 10th Linguistic Annotation Workshop held in conjunction with ACL 2016 (LAW-X 2016)

Proceedings of the 2016 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies
Kevin Knight | Ani Nenkova | Owen Rambow
Proceedings of the 2016 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies

The Instantiation Discourse Relation: A Corpus Analysis of Its Properties and Improved Detection
Junyi Jessy Li | Ani Nenkova
Proceedings of the 2016 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies

2015

System Combination for Multi-document Summarization
Kai Hong | Mitchell Marcus | Ani Nenkova
Proceedings of the 2015 Conference on Empirical Methods in Natural Language Processing

Detecting Content-Heavy Sentences: A Cross-Language Case Study
Junyi Jessy Li | Ani Nenkova
Proceedings of the 2015 Conference on Empirical Methods in Natural Language Processing

Inducing Lexical Style Properties for Paraphrase and Genre Differentiation
Ellie Pavlick | Ani Nenkova
Proceedings of the 2015 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies

Identification and Characterization of Newsworthy Verbs in World News
Benjamin Nye | Ani Nenkova
Proceedings of the 2015 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies

2014

Verbose, Laconic or Just Right: A Simple Computational Model of Content Appropriateness under Length Constraints
Annie Louis | Ani Nenkova
Proceedings of the 14th Conference of the European Chapter of the Association for Computational Linguistics

Improving the Estimation of Word Importance for News Multi-Document Summarization
Kai Hong | Ani Nenkova
Proceedings of the 14th Conference of the European Chapter of the Association for Computational Linguistics

Proceedings of the 3rd Workshop on Predicting and Improving Text Readability for Target Reader Populations (PITR)
Sandra Williams | Advaith Siddharthan | Ani Nenkova
Proceedings of the 3rd Workshop on Predicting and Improving Text Readability for Target Reader Populations (PITR)

Proceedings of the 5th Workshop on Speech and Language Processing for Assistive Technologies
Jan Alexandersson | Dimitra Anastasiou | Cui Jian | Ani Nenkova | Rupal Patel | Frank Rudzicz | Annalu Waller | Desislava Zhekova
Proceedings of the 5th Workshop on Speech and Language Processing for Assistive Technologies

Proceedings of the 15th Annual Meeting of the Special Interest Group on Discourse and Dialogue (SIGDIAL)
Kallirroi Georgila | Matthew Stone | Helen Hastie | Ani Nenkova
Proceedings of the 15th Annual Meeting of the Special Interest Group on Discourse and Dialogue (SIGDIAL)

Addressing Class Imbalance for Improved Recognition of Implicit Discourse Relations
Junyi Jessy Li | Ani Nenkova
Proceedings of the 15th Annual Meeting of the Special Interest Group on Discourse and Dialogue (SIGDIAL)

Reducing Sparsity Improves the Recognition of Implicit Discourse Relations
Junyi Jessy Li | Ani Nenkova
Proceedings of the 15th Annual Meeting of the Special Interest Group on Discourse and Dialogue (SIGDIAL)

Cross-lingual Discourse Relation Analysis: A corpus study and a semi-supervised classification system
Junyi Jessy Li | Marine Carpuat | Ani Nenkova
Proceedings of COLING 2014, the 25th International Conference on Computational Linguistics: Technical Papers

A Repository of State of the Art and Competitive Baseline Summaries for Generic News Summarization
Kai Hong | John Conroy | Benoit Favre | Alex Kulesza | Hui Lin | Ani Nenkova
Proceedings of the Ninth International Conference on Language Resources and Evaluation (LREC'14)

In the period since 2004, many novel sophisticated approaches for generic multi-document summarization have been developed. Intuitive simple approaches have also been shown to perform unexpectedly well for the task. Yet it is practically impossible to compare the existing approaches directly, because systems have been evaluated on different datasets, with different evaluation measures, against different sets of comparison systems. Here we present a corpus of summaries produced by several state-of-the-art extractive summarization systems or by popular baseline systems. The inputs come from the 2004 DUC evaluation, the latest year in which generic summarization was addressed in a shared task. We use the same settings for ROUGE automatic evaluation to compare the systems directly and analyze the statistical significance of the differences in performance. We show that in terms of average scores the state-of-the-art systems appear similar but that in fact they produce very different summaries. Our corpus will facilitate future research on generic summarization and motivates the need for development of more sensitive evaluation measures and for approaches to system combination in summarization.

Assessing the Discourse Factors that Influence the Quality of Machine Translation
Junyi Jessy Li | Marine Carpuat | Ani Nenkova
Proceedings of the 52nd Annual Meeting of the Association for Computational Linguistics (Volume 2: Short Papers)

2013

Proceedings of the Second Workshop on Predicting and Improving Text Readability for Target Reader Populations
Sandra Williams | Advaith Siddharthan | Ani Nenkova
Proceedings of the Second Workshop on Predicting and Improving Text Readability for Target Reader Populations

What Makes Writing Great? First Experiments on Article Quality Prediction in the Science Journalism Domain
Annie Louis | Ani Nenkova
Transactions of the Association for Computational Linguistics, Volume 1

Great writing is rare and highly admired. Readers seek out articles that are beautifully written, informative and entertaining. Yet information-access technologies lack capabilities for predicting article quality at this level. In this paper we present first experiments on article quality prediction in the science journalism domain. We introduce a corpus of great pieces of science journalism, along with typical articles from the genre. We implement features to capture aspects of great writing, including surprising, visual and emotional content, as well as general features related to discourse organization and sentence structure. We show that the distinction between great and typical articles can be detected fairly accurately, and that the entire spectrum of our features contributes to the distinction.

A Decade of Automatic Content Evaluation of News Summaries: Reassessing the State of the Art
Peter A. Rankel | John M. Conroy | Hoa Trang Dang | Ani Nenkova
Proceedings of the 51st Annual Meeting of the Association for Computational Linguistics (Volume 2: Short Papers)

Automatically Assessing Machine Summary Content Without a Gold Standard
Annie Louis | Ani Nenkova
Computational Linguistics, Volume 39, Issue 2 - June 2013

2012

Lexical Differences in Autobiographical Narratives from Schizophrenic Patients and Healthy Controls
Kai Hong | Christian G. Kohler | Mary E. March | Amber A. Parker | Ani Nenkova
Proceedings of the 2012 Joint Conference on Empirical Methods in Natural Language Processing and Computational Natural Language Learning

A Coherence Model Based on Syntactic Patterns
Annie Louis | Ani Nenkova
Proceedings of the 2012 Joint Conference on Empirical Methods in Natural Language Processing and Computational Natural Language Learning

Proceedings of the First Workshop on Predicting and Improving Text Readability for target reader populations
Sandra Williams | Advaith Siddharthan | Ani Nenkova
Proceedings of the First Workshop on Predicting and Improving Text Readability for target reader populations

Proceedings of Workshop on Evaluation Metrics and System Comparison for Automatic Summarization
John M. Conroy | Hoa Trang Dang | Ani Nenkova | Karolina Owczarzak
Proceedings of Workshop on Evaluation Metrics and System Comparison for Automatic Summarization

An Assessment of the Accuracy of Automatic Evaluation in Summarization
Karolina Owczarzak | John M. Conroy | Hoa Trang Dang | Ani Nenkova
Proceedings of Workshop on Evaluation Metrics and System Comparison for Automatic Summarization

A corpus of general and specific sentences from news
Annie Louis | Ani Nenkova
Proceedings of the Eighth International Conference on Language Resources and Evaluation (LREC'12)

We present a corpus of sentences from news articles that are annotated as general or specific. We employed annotators on Amazon Mechanical Turk to mark sentences from three kinds of news articles: reports on events, finance news and science journalism. We introduce the resulting corpus, with focus on annotator agreement, proportion of general/specific sentences in the articles and results for automatic classification of the two sentence types.

Acoustic-Prosodic Entrainment and Social Behavior
Rivka Levitan | Agustín Gravano | Laura Willson | Štefan Beňuš | Julia Hirschberg | Ani Nenkova
Proceedings of the 2012 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies

Proceedings of the NAACL HLT 2012 Student Research Workshop
Rivka Levitan | Myle Ott | Roger Levy | Ani Nenkova
Proceedings of the NAACL HLT 2012 Student Research Workshop

2011

Proceedings of the Workshop on Automatic Summarization for Different Genres, Media, and Languages
Ani Nenkova | Julia Hirschberg | Yang Liu
Proceedings of the Workshop on Automatic Summarization for Different Genres, Media, and Languages

Text Specificity and Impact on Quality of News Summaries
Annie Louis | Ani Nenkova
Proceedings of the Workshop on Monolingual Text-To-Text Generation

Information Status Distinctions and Referring Expressions: An Empirical Study of References to People in News Summaries
Advaith Siddharthan | Ani Nenkova | Kathleen McKeown
Computational Linguistics, Volume 37, Issue 4 - December 2011

Automatic identification of general and specific sentences by leveraging discourse annotations
Annie Louis | Ani Nenkova
Proceedings of 5th International Joint Conference on Natural Language Processing

Automatic Summarization
Ani Nenkova | Sameer Maskey | Yang Liu
Proceedings of the 49th Annual Meeting of the Association for Computational Linguistics: Tutorial Abstracts

2010

Creating Local Coherence: An Empirical Assessment
Annie Louis | Ani Nenkova
Human Language Technologies: The 2010 Annual Conference of the North American Chapter of the Association for Computational Linguistics

Using entity features to classify implicit discourse relations
Annie Louis | Aravind Joshi | Rashmi Prasad | Ani Nenkova
Proceedings of the SIGDIAL 2010 Conference

Discourse indicators for content selection in summarization
Annie Louis | Aravind Joshi | Ani Nenkova
Proceedings of the SIGDIAL 2010 Conference

Automatic Evaluation of Linguistic Quality in Multi-Document Summarization
Emily Pitler | Annie Louis | Ani Nenkova
Proceedings of the 48th Annual Meeting of the Association for Computational Linguistics

2009

Automatic sense prediction for implicit discourse relations in text
Emily Pitler | Annie Louis | Ani Nenkova
Proceedings of the Joint Conference of the 47th Annual Meeting of the ACL and the 4th International Joint Conference on Natural Language Processing of the AFNLP

Using Syntax to Disambiguate Explicit Discourse Connectives in Text
Emily Pitler | Ani Nenkova
Proceedings of the ACL-IJCNLP 2009 Conference Short Papers

Predicting the Fluency of Text with Shallow Structural Features: Case Studies of Machine Translation and Human-Written Text
Jieun Chae | Ani Nenkova
Proceedings of the 12th Conference of the European Chapter of the ACL (EACL 2009)

Performance Confidence Estimation for Automatic Summarization
Annie Louis | Ani Nenkova
Proceedings of the 12th Conference of the European Chapter of the ACL (EACL 2009)

Automatically Evaluating Content Selection in Summarization without Human Models
Annie Louis | Ani Nenkova
Proceedings of the 2009 Conference on Empirical Methods in Natural Language Processing

2008

Easily Identifiable Discourse Relations
Emily Pitler | Mridhula Raghupathy | Hena Mehta | Ani Nenkova | Alan Lee | Aravind Joshi
Coling 2008: Companion volume: Posters

Revisiting Readability: A Unified Framework for Predicting Text Quality
Emily Pitler | Ani Nenkova
Proceedings of the 2008 Conference on Empirical Methods in Natural Language Processing

Entity-driven Rewrite for Multi-document Summarization
Ani Nenkova
Proceedings of the Third International Joint Conference on Natural Language Processing: Volume-I

Can You Summarize This? Identifying Correlates of Input Difficulty for Multi-Document Summarization
Ani Nenkova | Annie Louis
Proceedings of ACL-08: HLT

High Frequency Word Entrainment in Spoken Dialogue
Ani Nenkova | Agustín Gravano | Julia Hirschberg
Proceedings of ACL-08: HLT, Short Papers

Tutorial Abstracts of ACL-08: HLT
Ani Nenkova | Marilyn Walker | Eugene Agichtein
Tutorial Abstracts of ACL-08: HLT

2007

To Memorize or to Predict: Prominence labeling in Conversational Speech
Ani Nenkova | Jason Brenier | Anubha Kothari | Sasha Calhoun | Laura Whitton | David Beaver | Dan Jurafsky
Human Language Technologies 2007: The Conference of the North American Chapter of the Association for Computational Linguistics; Proceedings of the Main Conference

Measuring Importance and Query Relevance in Topic-focused Multi-document Summarization
Surabhi Gupta | Ani Nenkova | Dan Jurafsky
Proceedings of the 45th Annual Meeting of the Association for Computational Linguistics Companion Volume Proceedings of the Demo and Poster Sessions

2005

Automatically Learning Cognitive Status for Multi-Document Summarization of Newswire
Ani Nenkova | Advaith Siddharthan | Kathleen McKeown
Proceedings of Human Language Technology Conference and Conference on Empirical Methods in Natural Language Processing

2004

Evaluating Content Selection in Summarization: The Pyramid Method
Ani Nenkova | Rebecca Passonneau
Proceedings of the Human Language Technology Conference of the North American Chapter of the Association for Computational Linguistics: HLT-NAACL 2004

Syntactic Simplification for Improving Content Selection in Multi-Document Summarization
Advaith Siddharthan | Ani Nenkova | Kathleen McKeown
COLING 2004: Proceedings of the 20th International Conference on Computational Linguistics

2003

References to Named Entities: a Corpus Study
Ani Nenkova | Kathleen McKeown
Companion Volume of the Proceedings of HLT-NAACL 2003 - Short Papers

Columbia’s Newsblaster: New Features and Future Directions
Kathleen McKeown | Regina Barzilay | John Chen | David Elson | David Evans | Judith Klavans | Ani Nenkova | Barry Schiffman | Sergey Sigelman
Companion Volume of the Proceedings of HLT-NAACL 2003 - Demonstrations
