Diane Litman

Also published as: Diane J. Litman


2024

Persuasiveness of Generated Free-Text Rationales in Subjective Decisions: A Case Study on Pairwise Argument Ranking
Mohamed Elaraby | Diane Litman | Xiang Lorraine Li | Ahmed Magooda
Findings of the Association for Computational Linguistics: EMNLP 2024

Generating free-text rationales is among the emergent capabilities of Large Language Models (LLMs). These rationales have been found to enhance LLM performance across various NLP tasks. Recently, there has been growing interest in using these rationales to provide insights for various important downstream tasks. In this paper, we analyze generated free-text rationales in tasks with subjective answers, emphasizing the importance of rationalization in such scenarios. We focus on pairwise argument ranking, a highly subjective task with significant potential for real-world applications, such as debate assistance. We evaluate the persuasiveness of rationales generated by nine LLMs to support their subjective choices. Our findings suggest that open-source LLMs, particularly Llama2-70B-chat, are capable of providing highly persuasive rationalizations, surpassing even GPT models. Additionally, our experiments demonstrate that the persuasiveness of the generated rationales can be enhanced by guiding their persuasive elements through prompting or self-refinement techniques.

Adding Argumentation into Human Evaluation of Long Document Abstractive Summarization: A Case Study on Legal Opinions
Mohamed Elaraby | Huihui Xu | Morgan Gray | Kevin Ashley | Diane Litman
Proceedings of the Fourth Workshop on Human Evaluation of NLP Systems (HumEval) @ LREC-COLING 2024

Human evaluation remains the gold standard for assessing abstractive summarization. However, current practices often prioritize constructing evaluation guidelines for fluency, coherence, and factual accuracy, overlooking other critical dimensions. In this paper, we investigate argument coverage in abstractive summarization by focusing on long legal opinions, where summaries must effectively encapsulate the document’s argumentative nature. We introduce a set of human-evaluation guidelines to evaluate generated summaries based on argumentative coverage. These guidelines enable us to assess three distinct summarization models, studying the influence of including argument roles in summarization. Furthermore, we utilize these evaluation scores to benchmark automatic summarization metrics against argument coverage, providing insights into the effectiveness of automated evaluation methods.

Using Large Language Models to Assess Young Students’ Writing Revisions
Tianwen Li | Zhexiong Liu | Lindsay Matsumura | Elaine Wang | Diane Litman | Richard Correnti
Proceedings of the 19th Workshop on Innovative Use of NLP for Building Educational Applications (BEA 2024)

Although effective revision is a crucial component of writing instruction, few automated writing evaluation (AWE) systems specifically focus on the quality of the revisions students undertake. In this study, we investigate the use of a large language model (GPT-4) with Chain-of-Thought (CoT) prompting for assessing the quality of young students’ essay revisions aligned with the automated feedback messages they received. Results indicate that GPT-4 has significant potential for evaluating revision quality, particularly when detailed rubrics are included that describe common revision patterns shown by young writers. However, the addition of CoT prompting did not significantly improve performance. Further examination of GPT-4’s scoring performance across various levels of student writing proficiency revealed variable agreement with human ratings. The implications for improving AWE systems focusing on young students are discussed.

Enhancing Knowledge Retrieval with Topic Modeling for Knowledge-Grounded Dialogue
Nhat Tran | Diane Litman
Proceedings of the 2024 Joint International Conference on Computational Linguistics, Language Resources and Evaluation (LREC-COLING 2024)

Knowledge retrieval is one of the major challenges in building a knowledge-grounded dialogue system. A common method is to use a neural retriever with a distributed approximate nearest-neighbor database to quickly find the relevant knowledge sentences. In this work, we propose an approach that utilizes topic modeling on the knowledge base to further improve retrieval accuracy and as a result, improve response generation. Additionally, we experiment with a large language model (LLM), ChatGPT, to take advantage of the improved retrieval performance to further improve the generation results. Experimental results on two datasets show that our approach can increase retrieval and generation performance. The results also indicate that ChatGPT is a better response generator for knowledge-grounded dialogue when relevant knowledge is provided.
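The retrieval scheme described in this abstract, narrowing the candidate pool with topic information before ranking, can be illustrated with a minimal stdlib sketch. The keyword-style topic labels and token-overlap scoring below are hypothetical stand-ins for the paper's learned topic model and neural retriever:

```python
from collections import Counter

def tokenize(text):
    return [w.lower().strip(".,?") for w in text.split()]

def overlap_score(query, sentence):
    # Stand-in for a neural retriever: count shared tokens.
    q, s = Counter(tokenize(query)), Counter(tokenize(sentence))
    return sum((q & s).values())

def retrieve(query, knowledge, query_topic, top_k=1):
    # First narrow the candidate pool to the query's predicted topic,
    # then rank the remaining sentences by relevance.
    candidates = [s for s, topic in knowledge if topic == query_topic]
    if not candidates:  # fall back to the full knowledge base
        candidates = [s for s, _ in knowledge]
    return sorted(candidates, key=lambda s: -overlap_score(query, s))[:top_k]

knowledge = [
    ("The Eiffel Tower is in Paris.", "landmarks"),
    ("Paris is the capital of France.", "geography"),
    ("The Louvre is a museum in Paris.", "landmarks"),
]
print(retrieve("Which museum is in Paris?", knowledge, "landmarks"))
# → ['The Louvre is a museum in Paris.']
```

The topic filter removes distractor sentences before scoring, which is the intuition behind the reported retrieval gains.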

ReflectSumm: A Benchmark for Course Reflection Summarization
Yang Zhong | Mohamed Elaraby | Diane Litman | Ahmed Ashraf Butt | Muhsin Menekse
Proceedings of the 2024 Joint International Conference on Computational Linguistics, Language Resources and Evaluation (LREC-COLING 2024)

This paper introduces ReflectSumm, a novel summarization dataset specifically designed for summarizing students’ reflective writing. The goal of ReflectSumm is to facilitate developing and evaluating novel summarization techniques tailored to real-world scenarios with little training data, with potential implications in the opinion summarization domain in general and the educational domain in particular. The dataset encompasses a diverse range of summarization tasks and includes comprehensive metadata, enabling the exploration of various research questions and supporting different applications. To showcase its utility, we conducted extensive evaluations using multiple state-of-the-art baselines. The results provide benchmarks for facilitating further research in this area.

2023

Predicting Desirable Revisions of Evidence and Reasoning in Argumentative Writing
Tazin Afrin | Diane Litman
Findings of the Association for Computational Linguistics: EACL 2023

We develop models to classify desirable evidence and desirable reasoning revisions in student argumentative writing. We explore two ways to improve classifier performance: using the essay context of the revision, and using the feedback students received before the revision. We perform both intrinsic and extrinsic evaluation for each of our models and report a qualitative analysis. Our results show that while a model using feedback information improves over a baseline model, models utilizing context, either alone or with feedback, are the most successful in identifying desirable revisions.

Towards Argument-Aware Abstractive Summarization of Long Legal Opinions with Summary Reranking
Mohamed Elaraby | Yang Zhong | Diane Litman
Findings of the Association for Computational Linguistics: ACL 2023

We propose a simple approach for the abstractive summarization of long legal opinions that takes into account the argument structure of the document. Legal opinions often contain complex and nuanced argumentation, making it challenging to generate a concise summary that accurately captures the main points of the legal opinion. Our approach involves using argument role information to generate multiple candidate summaries, then reranking these candidates based on alignment with the document’s argument structure. We demonstrate the effectiveness of our approach on a dataset of long legal opinions and show that it outperforms several strong baselines.

STRONG – Structure Controllable Legal Opinion Summary Generation
Yang Zhong | Diane Litman
Findings of the Association for Computational Linguistics: IJCNLP-AACL 2023 (Findings)

Overview of ImageArg-2023: The First Shared Task in Multimodal Argument Mining
Zhexiong Liu | Mohamed Elaraby | Yang Zhong | Diane Litman
Proceedings of the 10th Workshop on Argument Mining

This paper presents an overview of the ImageArg shared task, the first multimodal Argument Mining shared task co-located with the 10th Workshop on Argument Mining at EMNLP 2023. The shared task comprises two classification subtasks: (1) Subtask-A: Argument Stance Classification; (2) Subtask-B: Image Persuasiveness Classification. The former determines the stance of a tweet containing an image and a piece of text toward a controversial topic (e.g., gun control and abortion). The latter determines whether the image makes the tweet text more persuasive. The shared task received 31 submissions for Subtask-A and 21 submissions for Subtask-B from 9 different teams across 6 countries. The top submission in Subtask-A achieved an F1-score of 0.8647 while the best submission in Subtask-B achieved an F1-score of 0.5561.

Predicting the Quality of Revisions in Argumentative Writing
Zhexiong Liu | Diane Litman | Elaine Wang | Lindsay Matsumura | Richard Correnti
Proceedings of the 18th Workshop on Innovative Use of NLP for Building Educational Applications (BEA 2023)

The ability to revise in response to feedback is critical to students’ writing success. In the case of argument writing specifically, identifying whether an argument revision (AR) is successful or not is a complex problem because AR quality is dependent on the overall content of an argument. For example, adding the same evidence sentence could strengthen or weaken existing claims in different argument contexts (ACs). To address this issue, we developed Chain-of-Thought prompts to facilitate ChatGPT-generated ACs for AR quality predictions. The experiments on two corpora, our annotated elementary essays and an existing college essay benchmark, demonstrate the superiority of the proposed ACs over baselines.

2022

Getting Better Dialogue Context for Knowledge Identification by Leveraging Document-level Topic Shift
Nhat Tran | Diane Litman
Proceedings of the 23rd Annual Meeting of the Special Interest Group on Discourse and Dialogue

To build a goal-oriented dialogue system that can generate responses given a knowledge base, identifying the relevant pieces of information to be grounded in is vital. When the number of documents in the knowledge base is large, retrieval approaches are typically used to identify the top relevant documents. However, most prior work simply uses an entire dialogue history to guide retrieval, rather than exploiting a dialogue’s topical structure. In this work, we examine the importance of building the proper contextualized dialogue history when document-level topic shifts are present. Our results suggest that excluding irrelevant turns from the dialogue history (e.g., excluding turns not grounded in the same document as the current turn) leads to better retrieval results. We also propose a cascading approach utilizing the topical nature of a knowledge-grounded conversation to further manipulate the dialogue history used as input to the retrieval models.
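The history-filtering idea above, excluding turns not grounded in the same document as the current turn, can be sketched as follows. This is a minimal illustration; the turn/document pairs are hypothetical, not from the paper's data:

```python
def contextualized_history(turns, current_doc):
    # Keep only the turns grounded in the same document as the
    # current turn, dropping topically irrelevant history before
    # it is passed to a retrieval model.
    return [text for text, doc in turns if doc == current_doc]

turns = [
    ("How do I reset my password?", "doc_account"),
    ("Thanks! What payment methods do you accept?", "doc_billing"),
    ("Can I pay with PayPal?", "doc_billing"),
]
print(contextualized_history(turns, "doc_billing"))
# → ['Thanks! What payment methods do you accept?', 'Can I pay with PayPal?']
```

In the paper's cascading variant, the grounding document itself is predicted first, so this filter is applied to an inferred rather than gold topic boundary.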

Comparison of Lexical Alignment with a Teachable Robot in Human-Robot and Human-Human-Robot Interactions
Yuya Asano | Diane Litman | Mingzhi Yu | Nikki Lobczowski | Timothy Nokes-Malach | Adriana Kovashka | Erin Walker
Proceedings of the 23rd Annual Meeting of the Special Interest Group on Discourse and Dialogue

Speakers build rapport in the process of aligning conversational behaviors with each other. Rapport engendered with a teachable agent while instructing domain material has been shown to promote learning. Past work on lexical alignment in the field of education suffers from limitations in both the measures used to quantify alignment and the types of interactions in which alignment with agents has been studied. In this paper, we apply alignment measures based on a data-driven notion of shared expressions (possibly composed of multiple words) and compare alignment in one-on-one human-robot (H-R) interactions with the H-R portions of collaborative human-human-robot (H-H-R) interactions. We find that students in the H-R setting align with a teachable robot more than in the H-H-R setting and that the relationship between lexical alignment and rapport is more complex than what is predicted by previous theoretical and empirical work.

ImageArg: A Multi-modal Tweet Dataset for Image Persuasiveness Mining
Zhexiong Liu | Meiqi Guo | Yue Dai | Diane Litman
Proceedings of the 9th Workshop on Argument Mining

The growing interest in developing corpora of persuasive texts has promoted applications in automated systems, e.g., debating and essay scoring systems; however, there is little prior work mining image persuasiveness from an argumentative perspective. To expand persuasiveness mining into a multi-modal realm, we present a multi-modal dataset, ImageArg, consisting of annotations of image persuasiveness in tweets. The annotations are based on a persuasion taxonomy we developed to explore image functionalities and the means of persuasion. We benchmark image persuasiveness tasks on ImageArg using widely-used multi-modal learning methods. The experimental results show that our dataset offers a useful resource for this rich and challenging topic, and there is ample room for modeling improvement.

Computing and Exploiting Document Structure to Improve Unsupervised Extractive Summarization of Legal Case Decisions
Yang Zhong | Diane Litman
Proceedings of the Natural Legal Language Processing Workshop 2022

Though many algorithms can be used to automatically summarize legal case decisions, most fail to incorporate domain knowledge about how important sentences in a legal decision relate to a representation of its document structure. For example, analysis of a legal case summarization dataset demonstrates that sentences serving different types of argumentative roles in the decision appear in different sections of the document. In this work, we propose an unsupervised graph-based ranking model that uses a reweighting algorithm to exploit properties of the document structure of legal case decisions. We also explore the impact of using different methods to compute the document structure. Results on the Canadian Legal Case Law dataset show that our proposed method outperforms several strong baselines.

ArgLegalSumm: Improving Abstractive Summarization of Legal Documents with Argument Mining
Mohamed Elaraby | Diane Litman
Proceedings of the 29th International Conference on Computational Linguistics

A challenging task when generating summaries of legal documents is the ability to address their argumentative nature. We introduce a simple technique to capture the argumentative structure of legal documents by integrating argument role labeling into the summarization process. Experiments with pretrained language models show that our proposed approach improves performance over strong baselines.

2021

Essay Quality Signals as Weak Supervision for Source-based Essay Scoring
Haoran Zhang | Diane Litman
Proceedings of the 16th Workshop on Innovative Use of NLP for Building Educational Applications

Human essay grading is a laborious task that can consume much time and effort. Automated Essay Scoring (AES) has thus been proposed as a fast and effective solution to the problem of grading student writing at scale. However, because AES typically uses supervised machine learning, a human-graded essay corpus is still required to train the AES model. Unfortunately, such a graded corpus often does not exist, so creating a corpus for machine learning can also be a laborious task. This paper presents an investigation of replacing the use of human-labeled essay grades when training an AES system with two automatically available but weaker signals of essay quality: word count and topic distribution similarity. Experiments using two source-based essay scoring (evidence score) corpora show that while weak supervision does not yield a competitive result when training a neural source-based AES model, it can be used to successfully extract Topical Components (TCs) from a source text, which are required by a supervised feature-based AES model. In particular, results show that feature-based AES performance is comparable with either automatically or manually constructed TCs.

Self-trained Pretrained Language Models for Evidence Detection
Mohamed Elaraby | Diane Litman
Proceedings of the 8th Workshop on Argument Mining

Argument role labeling is a fundamental task in Argument Mining research. However, such research often suffers from a lack of large-scale datasets labeled for argument roles such as evidence, which is crucial for neural model training. While large pretrained language models have somewhat alleviated the need for massive manually labeled datasets, how much these models can further benefit from self-training techniques has not been widely explored in the literature in general and in Argument Mining specifically. In this work, we focus on self-trained language models (particularly BERT) for evidence detection. We provide a thorough investigation on how to utilize pseudo labels effectively in the self-training scheme. We also assess whether adding pseudo labels from an out-of-domain source can be beneficial. Experiments on sentence-level evidence detection show that self-training can complement pretrained language models to provide performance improvements.

Multi-task Learning in Argument Mining for Persuasive Online Discussions
Nhat Tran | Diane Litman
Proceedings of the 8th Workshop on Argument Mining

We utilize multi-task learning to improve argument mining in persuasive online discussions, in which both micro-level and macro-level argumentation must be taken into consideration. Our models learn to identify argument components and the relations between them at the same time. We also tackle the low precision that arises from imbalanced relation data by experimenting with SMOTE and XGBoost. Our approaches improve over baselines that use the same pre-trained language model but process the argument component task and two relation tasks separately. Furthermore, our results suggest that the tasks to be incorporated into multi-task learning should be taken into consideration, as using all relevant tasks does not always lead to the best performance.
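As a rough illustration of the oversampling side of this approach: true SMOTE interpolates a minority-class example toward one of its k nearest neighbors (as in imbalanced-learn's implementation), whereas the stdlib sketch below interpolates between random minority pairs to keep the example short. The data points are hypothetical:

```python
import random

def smote_like_oversample(minority, n_new, seed=0):
    """Minimal SMOTE-style oversampling (a stand-in for imblearn's SMOTE):
    synthesize new points by linearly interpolating between random pairs
    of minority-class feature vectors."""
    rng = random.Random(seed)
    synthetic = []
    for _ in range(n_new):
        a, b = rng.sample(minority, 2)   # two distinct minority examples
        lam = rng.random()               # interpolation factor in [0, 1)
        synthetic.append(tuple(ai + lam * (bi - ai) for ai, bi in zip(a, b)))
    return synthetic

minority = [(0.0, 1.0), (0.2, 0.9), (0.1, 1.1)]
new_points = smote_like_oversample(minority, n_new=5)
print(len(new_points))  # 5 synthetic minority examples
```

Because each synthetic point lies on a segment between two real minority examples, the oversampled class stays inside the original feature region, which is the property that makes SMOTE preferable to naive duplication for training a classifier such as XGBoost.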

Exploring Multitask Learning for Low-Resource Abstractive Summarization
Ahmed Magooda | Diane Litman | Mohamed Elaraby
Findings of the Association for Computational Linguistics: EMNLP 2021

This paper explores the effect of using multitask learning for abstractive summarization in the context of small training corpora. In particular, we incorporate four different tasks (extractive summarization, language modeling, concept detection, and paraphrase detection) both individually and in combination, with the goal of enhancing the target task of abstractive summarization via multitask learning. We show that for many task combinations, a model trained in a multitask setting outperforms a model trained only for abstractive summarization, with no additional summarization data introduced. Additionally, we do a comprehensive search and find that certain tasks (e.g. paraphrase detection) consistently benefit abstractive summarization, not only when combined with other tasks but also when using different architectures and training corpora.

Mitigating Data Scarceness through Data Synthesis, Augmentation and Curriculum for Abstractive Summarization
Ahmed Magooda | Diane Litman
Findings of the Association for Computational Linguistics: EMNLP 2021

This paper explores three simple data manipulation techniques (synthesis, augmentation, curriculum) for improving abstractive summarization models without the need for any additional data. We introduce a method of data synthesis with paraphrasing, a data augmentation technique with sample mixing, and curriculum learning with two new difficulty metrics based on specificity and abstractiveness. We conduct experiments to show that these three techniques can help improve abstractive summarization across two summarization models and two different small datasets. Furthermore, we show that these techniques can improve performance when applied in isolation and when combined.

2020

Contextual Argument Component Classification for Class Discussions
Luca Lugini | Diane Litman
Proceedings of the 28th International Conference on Computational Linguistics

Argument mining systems often consider contextual information, i.e. information outside of an argumentative discourse unit, when trained to accomplish tasks such as argument component identification, classification, and relation extraction. However, prior work has not carefully analyzed the utility of different contextual properties in context-aware models. In this work, we show how two different types of contextual information, local discourse context and speaker context, can be incorporated into a computational model for classifying argument components in multi-party classroom discussions. We find that both context types can improve performance, although the improvements are dependent on context size and position.

Discussion Tracker: Supporting Teacher Learning about Students’ Collaborative Argumentation in High School Classrooms
Luca Lugini | Christopher Olshefski | Ravneet Singh | Diane Litman | Amanda Godley
Proceedings of the 28th International Conference on Computational Linguistics: System Demonstrations

Teaching collaborative argumentation is an advanced skill that many K-12 teachers struggle to develop. To address this, we have developed Discussion Tracker, a classroom discussion analytics system based on novel algorithms for classifying argument moves, specificity, and collaboration. Results from a classroom deployment indicate that teachers found the analytics useful, and that the underlying classifiers perform with moderate to substantial agreement with humans.

Annotation and Classification of Evidence and Reasoning Revisions in Argumentative Writing
Tazin Afrin | Elaine Lin Wang | Diane Litman | Lindsay Clare Matsumura | Richard Correnti
Proceedings of the Fifteenth Workshop on Innovative Use of NLP for Building Educational Applications

Automated writing evaluation systems can improve students’ writing insofar as students attend to the feedback provided and revise their essay drafts in ways aligned with such feedback. Existing research on revision of argumentative writing in such systems, however, has focused on the types of revisions students make (e.g., surface vs. content) rather than the extent to which revisions actually respond to the feedback provided and improve the essay. We introduce an annotation scheme to capture the nature of sentence-level revisions of evidence use and reasoning (the ‘RER’ scheme) and apply it to 5th- and 6th-grade students’ argumentative essays. We show that reliable manual annotation can be achieved and that revision annotations correlate with a holistic assessment of essay improvement in line with the feedback provided. Furthermore, we explore the feasibility of automatically classifying revisions according to our scheme.

The Discussion Tracker Corpus of Collaborative Argumentation
Christopher Olshefski | Luca Lugini | Ravneet Singh | Diane Litman | Amanda Godley
Proceedings of the Twelfth Language Resources and Evaluation Conference

Although NLP research on argument mining has advanced considerably in recent years, most studies draw on corpora of asynchronous and written texts, often produced by individuals. Few published corpora of synchronous, multi-party argumentation are available. The Discussion Tracker corpus, collected in high school English classes, is an annotated dataset of transcripts of spoken, multi-party argumentation. The corpus consists of 29 multi-party discussions of English literature transcribed from 985 minutes of audio. The transcripts were annotated for three dimensions of collaborative argumentation: argument moves (claims, evidence, and explanations), specificity (low, medium, high) and collaboration (e.g., extensions of and disagreements about others’ ideas). In addition to providing descriptive statistics on the corpus, we provide performance benchmarks and associated code for predicting each dimension separately, illustrate the use of the multiple annotations in the corpus to improve performance via multi-task learning, and finally discuss other ways the corpus might be used to further NLP research.

Automated Topical Component Extraction Using Neural Network Attention Scores from Source-based Essay Scoring
Haoran Zhang | Diane Litman
Proceedings of the 58th Annual Meeting of the Association for Computational Linguistics

While automated essay scoring (AES) can reliably grade essays at scale, automated writing evaluation (AWE) additionally provides formative feedback to guide essay revision. However, a neural AES typically does not provide useful feature representations for supporting AWE. This paper presents a method for linking AWE and neural AES, by extracting Topical Components (TCs) representing evidence from a source text using the intermediate output of attention layers. We evaluate performance using a feature-based AES requiring TCs. Results show that performance is comparable whether using automatically or manually constructed TCs for 1) representing essays as rubric-based features, 2) grading essays.

2018

Annotating Student Talk in Text-based Classroom Discussions
Luca Lugini | Diane Litman | Amanda Godley | Christopher Olshefski
Proceedings of the Thirteenth Workshop on Innovative Use of NLP for Building Educational Applications

Classroom discussions in English Language Arts have a positive effect on students’ reading, writing and reasoning skills. Although prior work has largely focused on teacher talk and student-teacher interactions, we focus on three theoretically-motivated aspects of high-quality student talk: argumentation, specificity, and knowledge domain. We introduce an annotation scheme, then show that the scheme can be used to produce reliable annotations and that the annotations are predictive of discussion quality. We also highlight opportunities provided by our scheme for education and natural language processing research.

Annotation and Classification of Sentence-level Revision Improvement
Tazin Afrin | Diane Litman
Proceedings of the Thirteenth Workshop on Innovative Use of NLP for Building Educational Applications

Studies of writing revisions rarely focus on revision quality. To address this issue, we introduce a corpus of between-draft revisions of student argumentative essays, annotated as to whether each revision improves essay quality. We demonstrate a potential usage of our annotations by developing a machine learning model to predict revision improvement. With the goal of expanding training data, we also extract revisions from a dataset edited by expert proofreaders. Our results indicate that blending expert and non-expert revisions increases model performance, with expert data particularly important for predicting low-quality revisions.

Co-Attention Based Neural Network for Source-Dependent Essay Scoring
Haoran Zhang | Diane Litman
Proceedings of the Thirteenth Workshop on Innovative Use of NLP for Building Educational Applications

This paper presents an investigation of using a co-attention based neural network for source-dependent essay scoring. We use a co-attention mechanism to help the model learn the importance of each part of the essay more accurately. Also, this paper shows that the co-attention based neural network model provides reliable score prediction of source-dependent responses. We evaluate our model on two source-dependent response corpora. Results show that our model outperforms the baseline on both corpora. We also show, through examples, that the model’s attention is similar to expert opinions.

Proceedings of the 19th Annual SIGdial Meeting on Discourse and Dialogue
Kazunori Komatani | Diane Litman | Kai Yu | Alex Papangelis | Lawrence Cavedon | Mikio Nakano
Proceedings of the 19th Annual SIGdial Meeting on Discourse and Dialogue

Weighting Model Based on Group Dynamics to Measure Convergence in Multi-party Dialogue
Zahra Rahimi | Diane Litman
Proceedings of the 19th Annual SIGdial Meeting on Discourse and Dialogue

This paper proposes a new weighting method for extending a dyad-level measure of convergence to multi-party dialogues by considering group dynamics instead of simply averaging. Experiments indicate the usefulness of the proposed weighted measure and also show that in general a proper weighting of the dyad-level measures performs better than non-weighted averaging in multiple tasks.

Argument Component Classification for Classroom Discussions
Luca Lugini | Diane Litman
Proceedings of the 5th Workshop on Argument Mining

This paper focuses on argument component classification for transcribed spoken classroom discussions, with the goal of automatically classifying student utterances into claims, evidence, and warrants. We show that an existing method for argument component classification developed for another educationally-oriented domain performs poorly on our dataset. We then show that feature sets from prior work on argument mining for student essays and online dialogues can be used to improve performance considerably. We also provide a comparison between convolutional neural networks and recurrent neural networks when trained under different conditions to classify argument components in classroom discussions. While neural network models are not always able to outperform a logistic regression model, we were able to gain some useful insights: convolutional networks are more robust than recurrent networks both at the character and at the word level, and specificity information can help boost performance in multi-task training.

2017

Predicting Specificity in Classroom Discussion
Luca Lugini | Diane Litman
Proceedings of the 12th Workshop on Innovative Use of NLP for Building Educational Applications

High quality classroom discussion is important to student development, enhancing abilities to express claims, reason about other students’ claims, and retain information for longer periods of time. Previous small-scale studies have shown that one indicator of classroom discussion quality is specificity. In this paper we tackle the problem of predicting specificity for classroom discussions. We propose several methods and feature sets capable of outperforming the state of the art in specificity prediction. Additionally, we provide a set of meaningful, interpretable features that can be used to analyze classroom discussions at a pedagogical level.

Proceedings of the 4th Workshop on Argument Mining
Ivan Habernal | Iryna Gurevych | Kevin Ashley | Claire Cardie | Nancy Green | Diane Litman | Georgios Petasis | Chris Reed | Noam Slonim | Vern Walker
Proceedings of the 4th Workshop on Argument Mining

A Corpus of Annotated Revisions for Studying Argumentative Writing
Fan Zhang | Homa B. Hashemi | Rebecca Hwa | Diane Litman
Proceedings of the 55th Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers)

This paper presents ArgRewrite, a corpus of between-draft revisions of argumentative essays. Drafts are manually aligned at the sentence level, and the writer’s purpose for each revision is annotated with categories analogous to those used in argument mining and discourse analysis. The corpus should enable advanced research in writing comparison and revision analysis, as demonstrated via our own studies of student revision behavior and of automatic revision purpose prediction.

Word Embedding for Response-To-Text Assessment of Evidence
Haoran Zhang | Diane Litman
Proceedings of ACL 2017, Student Research Workshop

2016

pdf bib
The Teams Corpus and Entrainment in Multi-Party Spoken Dialogues
Diane Litman | Susannah Paletz | Zahra Rahimi | Stefani Allegretti | Caitlin Rice
Proceedings of the 2016 Conference on Empirical Methods in Natural Language Processing

pdf bib
Automatic Summarization of Student Course Feedback
Wencan Luo | Fei Liu | Zitao Liu | Diane Litman
Proceedings of the 2016 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies

pdf bib
Using Context to Predict the Purpose of Argumentative Writing Revisions
Fan Zhang | Diane Litman
Proceedings of the 2016 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies

pdf bib
Instant Feedback for Increasing the Presence of Solutions in Peer Reviews
Huy Nguyen | Wenting Xiong | Diane Litman
Proceedings of the 2016 Conference of the North American Chapter of the Association for Computational Linguistics: Demonstrations

pdf bib
ArgRewrite: A Web-based Revision Assistant for Argumentative Writings
Fan Zhang | Rebecca Hwa | Diane Litman | Homa B. Hashemi
Proceedings of the 2016 Conference of the North American Chapter of the Association for Computational Linguistics: Demonstrations

pdf bib
Automatically Extracting Topical Components for a Response-to-Text Writing Assessment
Zahra Rahimi | Diane Litman
Proceedings of the 11th Workshop on Innovative Use of NLP for Building Educational Applications

pdf bib
Extracting PDTB Discourse Relations from Student Essays
Kate Forbes-Riley | Fan Zhang | Diane Litman
Proceedings of the 17th Annual Meeting of the Special Interest Group on Discourse and Dialogue

pdf bib
Towards Using Conversations with Spoken Dialogue Systems in the Automated Assessment of Non-Native Speakers of English
Diane Litman | Steve Young | Mark Gales | Kate Knill | Karen Ottewell | Rogier van Dalen | David Vandyke
Proceedings of the 17th Annual Meeting of the Special Interest Group on Discourse and Dialogue

pdf bib
An Improved Phrase-based Approach to Annotating and Summarizing Student Course Responses
Wencan Luo | Fei Liu | Diane Litman
Proceedings of COLING 2016, the 26th International Conference on Computational Linguistics: Technical Papers

Teaching large classes remains a great challenge, primarily because it is difficult to attend to all student needs in a timely manner. Automatic text summarization systems can be leveraged to summarize student feedback submitted immediately after each lecture, but what makes a good summary of student responses remains an open question. In this work we explore a new methodology that effectively extracts summary phrases from the student responses. Each phrase is tagged with the number of students who raised the issue. The phrases are evaluated along two dimensions: with respect to text content, they should be informative and well-formed, as measured by the ROUGE metric; additionally, they should address the most pressing student needs, as measured by a newly proposed metric. This work is enabled by a phrase-based annotation and highlighting scheme, which is new to the summarization task. The phrase-based framework allows us to summarize the student responses into a set of bullet points and present them to the instructor promptly.

pdf bib
Inferring Discourse Relations from PDTB-style Discourse Labels for Argumentative Revision Classification
Fan Zhang | Diane Litman | Katherine Forbes Riley
Proceedings of COLING 2016, the 26th International Conference on Computational Linguistics: Technical Papers

Penn Discourse Treebank (PDTB)-style annotation focuses on labeling local discourse relations between text spans and typically ignores larger discourse contexts. In this paper we propose two approaches to infer discourse relations in a paragraph-level context from annotated PDTB labels. We investigate the utility of inferring such discourse information using the task of revision classification. Experimental results demonstrate that the inferred information can significantly improve classification performance compared to baselines, not only when PDTB annotation comes from humans but also from automatic parsers.

pdf bib
Context-aware Argumentative Relation Mining
Huy Nguyen | Diane Litman
Proceedings of the 54th Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers)

2015

pdf bib
Enhancing Instructor-Student and Student-Student Interactions with Mobile Interfaces and Summarization
Wencan Luo | Xiangmin Fan | Muhsin Menekse | Jingtao Wang | Diane Litman
Proceedings of the 2015 Conference of the North American Chapter of the Association for Computational Linguistics: Demonstrations

pdf bib
Summarizing Student Responses to Reflection Prompts
Wencan Luo | Diane Litman
Proceedings of the 2015 Conference on Empirical Methods in Natural Language Processing

pdf bib
Extracting Argument and Domain Words for Identifying Argument Components in Texts
Huy Nguyen | Diane Litman
Proceedings of the 2nd Workshop on Argumentation Mining

pdf bib
Incorporating Coherence of Topics as a Criterion in Automatic Response-to-Text Assessment of the Organization of Writing
Zahra Rahimi | Diane Litman | Elaine Wang | Richard Correnti
Proceedings of the Tenth Workshop on Innovative Use of NLP for Building Educational Applications

pdf bib
Annotation and Classification of Argumentative Writing Revisions
Fan Zhang | Diane Litman
Proceedings of the Tenth Workshop on Innovative Use of NLP for Building Educational Applications

2014

pdf bib
Improving Peer Feedback Prediction: The Sentence Level is Right
Huy Nguyen | Diane Litman
Proceedings of the Ninth Workshop on Innovative Use of NLP for Building Educational Applications

pdf bib
Sentence-level Rewriting Detection
Fan Zhang | Diane Litman
Proceedings of the Ninth Workshop on Innovative Use of NLP for Building Educational Applications

pdf bib
Proceedings of the First Workshop on Argumentation Mining
Nancy Green | Kevin Ashley | Diane Litman | Chris Reed | Vern Walker
Proceedings of the First Workshop on Argumentation Mining

pdf bib
Ontology-Based Argument Mining and Automatic Essay Scoring
Nathan Ong | Diane Litman | Alexandra Brusilovsky
Proceedings of the First Workshop on Argumentation Mining

pdf bib
Evaluating a Spoken Dialogue System that Detects and Adapts to User Affective States
Diane Litman | Katherine Forbes-Riley
Proceedings of the 15th Annual Meeting of the Special Interest Group on Discourse and Dialogue (SIGDIAL)

pdf bib
Empirical analysis of exploiting review helpfulness for extractive summarization of online reviews
Wenting Xiong | Diane Litman
Proceedings of COLING 2014, the 25th International Conference on Computational Linguistics: Technical Papers

2013

pdf bib
Differences in User Responses to a Wizard-of-Oz versus Automated System
Jesse Thomason | Diane Litman
Proceedings of the 2013 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies

pdf bib
Reducing Annotation Effort on Unbalanced Corpus based on Cost Matrix
Wencan Luo | Diane Litman | Joel Chan
Proceedings of the 2013 NAACL HLT Student Research Workshop

2012

pdf bib
Intrinsic and Extrinsic Evaluation of an Automatic User Disengagement Detector for an Uncertainty-Adaptive Spoken Dialogue System
Kate Forbes-Riley | Diane Litman | Heather Friedberg | Joanna Drummond
Proceedings of the 2012 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies

pdf bib
Cohesion, Entrainment and Task Success in Educational Dialog
Diane Litman
Proceedings of the 13th Annual Meeting of the Special Interest Group on Discourse and Dialogue

pdf bib
Adapting to Multiple Affective States in Spoken Dialogue
Kate Forbes-Riley | Diane Litman
Proceedings of the 13th Annual Meeting of the Special Interest Group on Discourse and Dialogue

pdf bib
An Interactive Analytic Tool for Peer-Review Exploration
Wenting Xiong | Diane Litman | Jingtao Wang | Christian Schunn
Proceedings of the Seventh Workshop on Building Educational Applications Using NLP

2011

pdf bib
Automatically Predicting Peer-Review Helpfulness
Wenting Xiong | Diane Litman
Proceedings of the 49th Annual Meeting of the Association for Computational Linguistics: Human Language Technologies

pdf bib
Understanding Differences in Perceived Peer-Review Helpfulness using Natural Language Processing
Wenting Xiong | Diane Litman
Proceedings of the Sixth Workshop on Innovative Use of NLP for Building Educational Applications

pdf bib
Predicting Change in Student Motivation by Measuring Cohesion between Tutor and Student
Arthur Ward | Diane Litman | Maxine Eskenazi
Proceedings of the Sixth Workshop on Innovative Use of NLP for Building Educational Applications

pdf bib
Using Performance Trajectories to Analyze the Immediate Impact of User State Misclassification in an Adaptive Spoken Dialogue System
Kate Forbes-Riley | Diane Litman
Proceedings of the SIGDIAL 2011 Conference

pdf bib
Examining the Impacts of Dialogue Content and System Automation on Affect Models in a Spoken Tutorial Dialogue System
Joanna Drummond | Diane Litman
Proceedings of the SIGDIAL 2011 Conference

2010

pdf bib
Proceedings of the NAACL HLT 2010 Student Research Workshop
Julia Hockenmaier | Diane Litman | Adriane Boyd | Mahesh Joshi | Frank Rudzicz
Proceedings of the NAACL HLT 2010 Student Research Workshop

2009

pdf bib
Discourse Structure and Performance Analysis: Beyond the Correlation
Mihai Rotaru | Diane Litman
Proceedings of the SIGDIAL 2009 Conference

pdf bib
Spoken Tutorial Dialogue and the Feeling of Another’s Knowing
Diane Litman | Kate Forbes-Riley
Proceedings of the SIGDIAL 2009 Conference

pdf bib
Setting Up User Action Probabilities in User Simulations for Dialog System Development
Hua Ai | Diane Litman
Proceedings of the Joint Conference of the 47th Annual Meeting of the ACL and the 4th International Joint Conference on Natural Language Processing of the AFNLP

2008

pdf bib
Assessing Dialog System User Simulation Evaluation Measures Using Human Judges
Hua Ai | Diane J. Litman
Proceedings of ACL-08: HLT

pdf bib
Uncertainty Corpus: Resource to Study User Affect in Complex Spoken Dialogue Systems
Kate Forbes-Riley | Diane Litman | Scott Silliman | Amruta Purandare
Proceedings of the Sixth International Conference on Language Resources and Evaluation (LREC'08)

We present a corpus of spoken dialogues between students and an adaptive Wizard-of-Oz tutoring system, in which student uncertainty was manually annotated in real-time. We detail the corpus contents, including speech files, transcripts, annotations, and log files, and we discuss possible future uses by the computational linguistics community as a novel resource for studying naturally occurring user affect and adaptation in complex spoken dialogue systems.

2007

pdf bib
Estimating the Reliability of MDP Policies: a Confidence Interval Approach
Joel Tetreault | Dan Bohus | Diane Litman
Human Language Technologies 2007: The Conference of the North American Chapter of the Association for Computational Linguistics; Proceedings of the Main Conference

pdf bib
Comparing User Simulation Models For Dialog Strategy Learning
Hua Ai | Joel Tetreault | Diane Litman
Human Language Technologies 2007: The Conference of the North American Chapter of the Association for Computational Linguistics; Companion Volume, Short Papers

pdf bib
Exploring Affect-Context Dependencies for Adaptive System Development
Kate Forbes-Riley | Mihai Rotaru | Diane Litman | Joel Tetreault
Human Language Technologies 2007: The Conference of the North American Chapter of the Association for Computational Linguistics; Companion Volume, Short Papers

pdf bib
The Utility of a Graphical Representation of Discourse Structure in Spoken Dialogue Systems
Mihai Rotaru | Diane Litman
Proceedings of the 45th Annual Meeting of the Association of Computational Linguistics

pdf bib
Comparing Spoken Dialog Corpora Collected with Recruited Subjects versus Real Users
Hua Ai | Antoine Raux | Dan Bohus | Maxine Eskenazi | Diane Litman
Proceedings of the 8th SIGdial Workshop on Discourse and Dialogue

2006

pdf bib
Dependencies between Student State and Speech Recognition Problems in Spoken Tutoring Dialogues
Mihai Rotaru | Diane J. Litman
Proceedings of the 21st International Conference on Computational Linguistics and 44th Annual Meeting of the Association for Computational Linguistics

pdf bib
Characterizing and Predicting Corrections in Spoken Dialogue Systems
Diane Litman | Marc Swerts | Julia Hirschberg
Computational Linguistics, Volume 32, Number 3, September 2006

pdf bib
Using Reinforcement Learning to Build a Better Model of Dialogue State
Joel R. Tetreault | Diane J. Litman
11th Conference of the European Chapter of the Association for Computational Linguistics

pdf bib
Modelling User Satisfaction and Student Learning in a Spoken Dialogue Tutoring System with Generic, Tutoring, and User Affect Parameters
Kate Forbes-Riley | Diane Litman
Proceedings of the Human Language Technology Conference of the NAACL, Main Conference

pdf bib
Comparing the Utility of State Features in Spoken Dialogue Using Reinforcement Learning
Joel Tetreault | Diane Litman
Proceedings of the Human Language Technology Conference of the NAACL, Main Conference

pdf bib
Manual Annotation of Opinion Categories in Meetings
Swapna Somasundaran | Janyce Wiebe | Paul Hoffmann | Diane Litman
Proceedings of the Workshop on Frontiers in Linguistically Annotated Corpora 2006

pdf bib
Discourse and Dialogue Processing in Spoken Intelligent Tutoring Systems
Diane J. Litman
Proceedings of the 7th SIGdial Workshop on Discourse and Dialogue

pdf bib
Exploiting Discourse Structure for Spoken Dialogue Performance Analysis
Mihai Rotaru | Diane J. Litman
Proceedings of the 2006 Conference on Empirical Methods in Natural Language Processing

pdf bib
Humor: Prosody Analysis and Automatic Recognition for F*R*I*E*N*D*S*
Amruta Purandare | Diane Litman
Proceedings of the 2006 Conference on Empirical Methods in Natural Language Processing

2005

pdf bib
Predicting Learning in Tutoring with the Landscape Model of Memory
Arthur Ward | Diane Litman
Proceedings of the Second Workshop on Building Educational Applications Using NLP

pdf bib
Using Bigrams to Identify Relationships Between Student Certainness States and Tutor Responses in a Spoken Dialogue Corpus
Kate Forbes-Riley | Diane J. Litman
Proceedings of the 6th SIGdial Workshop on Discourse and Dialogue

2004

pdf bib
Predicting Student Emotions in Computer-Human Tutoring Dialogues
Diane J. Litman | Kate Forbes-Riley
Proceedings of the 42nd Annual Meeting of the Association for Computational Linguistics (ACL-04)

pdf bib
Co-training for Predicting Emotions with Spoken Dialogue Data
Beatriz Maeireizo | Diane Litman | Rebecca Hwa
Proceedings of the ACL Interactive Poster and Demonstration Sessions

pdf bib
Annotating Student Emotional States in Spoken Tutoring Dialogues
Diane J. Litman | Kate Forbes-Riley
Proceedings of the 5th SIGdial Workshop on Discourse and Dialogue at HLT-NAACL 2004

pdf bib
Predicting Emotion in Spoken Dialogue from Multiple Knowledge Sources
Kate Forbes-Riley | Diane Litman
Proceedings of the Human Language Technology Conference of the North American Chapter of the Association for Computational Linguistics: HLT-NAACL 2004

pdf bib
ITSPOKE: An Intelligent Tutoring Spoken Dialogue System
Diane J. Litman | Scott Silliman
Demonstration Papers at HLT-NAACL 2004

2003

pdf bib
Towards Emotion Prediction in Spoken Tutoring Dialogues
Diane Litman | Kate Forbes | Scott Silliman
Companion Volume of the Proceedings of HLT-NAACL 2003 - Short Papers

pdf bib
A Comparison of Tutor and Student Behavior in Speech Versus Text Based Tutoring
Carolyn P. Rosé | Diane Litman | Dumisizwe Bhembe | Kate Forbes | Scott Silliman | Ramesh Srivastava | Kurt VanLehn
Proceedings of the HLT-NAACL 03 Workshop on Building Educational Applications Using Natural Language Processing

pdf bib
Exceptionality and Natural Language Learning
Mihai Rotaru | Diane J. Litman
Proceedings of the Seventh Conference on Natural Language Learning at HLT-NAACL 2003

2001

pdf bib
Identifying User Corrections Automatically in Spoken Dialogue Systems
Julia Hirschberg | Diane Litman | Marc Swerts
Second Meeting of the North American Chapter of the Association for Computational Linguistics

pdf bib
Labeling Corrections and Aware Sites in Spoken Dialogue Systems
Julia Hirschberg | Marc Swerts | Diane Litman
Proceedings of the Second SIGdial Workshop on Discourse and Dialogue

pdf bib
Predicting User Reactions to System Error
Diane Litman | Julia Hirschberg | Marc Swerts
Proceedings of the 39th Annual Meeting of the Association for Computational Linguistics

2000

pdf bib
NJFun- A Reinforcement Learning Spoken Dialogue System
Diane Litman | Satinder Singh | Michael Kearns | Marilyn Walker
ANLP-NAACL 2000 Workshop: Conversational Systems

pdf bib
Learning to Predict Problematic Situations in a Spoken Dialogue System: Experiments with How May I Help You?
Marilyn Walker | Irene Langkilde | Jerry Wright | Allen Gorin | Diane Litman
1st Meeting of the North American Chapter of the Association for Computational Linguistics

pdf bib
Predicting Automatic Speech Recognition Performance Using Prosodic Cues
Diane J. Litman | Julia B. Hirschberg | Marc Swerts
1st Meeting of the North American Chapter of the Association for Computational Linguistics

pdf bib
Automatic Optimization of Dialogue Management
Diane J. Litman | Michael S. Kearns | Satinder Singh | Marilyn A. Walker
COLING 2000 Volume 1: The 18th International Conference on Computational Linguistics

1999

pdf bib
Automatic Detection of Poor Speech Recognition at the Dialogue Level
Diane J. Litman | Marilyn A. Walker | Michael S. Kearns
Proceedings of the 37th Annual Meeting of the Association for Computational Linguistics

1998

pdf bib
Evaluating Response Strategies in a Web-Based Spoken Dialogue Agent
Diane J. Litman | Shimei Pan | Marilyn A. Walker
36th Annual Meeting of the Association for Computational Linguistics and 17th International Conference on Computational Linguistics, Volume 2

pdf bib
Evaluating Response Strategies in a Web-Based Spoken Dialogue Agent
Diane J. Litman | Shimei Pan | Marilyn A. Walker
COLING 1998 Volume 2: The 17th International Conference on Computational Linguistics

1997

pdf bib
Evaluating Interactive Dialogue Systems: Extending Component Evaluation to Integrated System Evaluation
Marilyn A. Walker | Diane J. Litman | Candace A. Kamm | Alicia Abella
Interactive Spoken Dialog Systems: Bringing Speech and NLP Together in Real Applications

pdf bib
PARADISE: A Framework for Evaluating Spoken Dialogue Agents
Marilyn A. Walker | Diane J. Litman | Candace A. Kamm | Alicia Abella
35th Annual Meeting of the Association for Computational Linguistics and 8th Conference of the European Chapter of the Association for Computational Linguistics

pdf bib
Discourse Segmentation by Human and Automated Means
Rebecca J. Passonneau | Diane J. Litman
Computational Linguistics, Volume 23, Number 1, March 1997

1995

pdf bib
Combining Multiple Knowledge Sources for Discourse Segmentation
Diane J. Litman | Rebecca J. Passonneau
33rd Annual Meeting of the Association for Computational Linguistics

1993

pdf bib
Empirical Evidence for Intention-Based Discourse Segmentation
Diane J. Litman | Rebecca J. Passonneau
Intentionality and Structure in Discourse Relations

pdf bib
Empirical Studies on the Disambiguation of Cue Phrases
Julia Hirschberg | Diane Litman
Computational Linguistics, Volume 19, Number 3, September 1993

pdf bib
Intention-Based Segmentation: Human Reliability and Correlation With Linguistic Cues
Rebecca J. Passonneau | Diane J. Litman
31st Annual Meeting of the Association for Computational Linguistics

1992

pdf bib
Extracting Constraints on Word Usage from Large Text Corpora
Kathleen McKeown | Diane Litman | Rebecca Passonneau
Speech and Natural Language: Proceedings of a Workshop Held at Harriman, New York, February 23-26, 1992

1990

pdf bib
Disambiguating Cue Phrases in Text and Speech
Diane Litman | Julia Hirschberg
COLING 1990 Volume 2: Papers presented to the 13th International Conference on Computational Linguistics

1987

pdf bib
Now Let’s Talk About Now; Identifying Cue Phrases Intonationally
Julia Hirschberg | Diane Litman
25th Annual Meeting of the Association for Computational Linguistics

1986

pdf bib
Linguistic Coherence: A Plan-Based Alternative
Diane J. Litman
24th Annual Meeting of the Association for Computational Linguistics

1984

pdf bib
A Plan Recognition Model for Clarification Subdialogues
Diane J. Litman | James F. Allen
10th International Conference on Computational Linguistics and 22nd Annual Meeting of the Association for Computational Linguistics
