Cecilia Ovesdotter Alm

Also published as: Cecilia O. Alm


2024

FUSE - FrUstration and Surprise Expressions: A Subtle Emotional Multimodal Language Corpus
Rajesh Titung | Cecilia Ovesdotter Alm
Proceedings of the 2024 Joint International Conference on Computational Linguistics, Language Resources and Evaluation (LREC-COLING 2024)

This study introduces a novel multimodal corpus for expressive task-based spoken language and dialogue, focused on language use under frustration and surprise, elicited from three tasks motivated by prior research and collected in an IRB-approved experiment. The resource is unique both because these are understudied affect states for emotion modeling in language, and also because it provides both individual and dyadic multimodally grounded language. The study includes a detailed analysis of annotations and performance results for multimodal emotion inference in language use.

MULTICOLLAB: A Multimodal Corpus of Dialogues for Analyzing Collaboration and Frustration in Language
Michael Peechatt | Cecilia Ovesdotter Alm | Reynold Bailey
Proceedings of the 2024 Joint International Conference on Computational Linguistics, Language Resources and Evaluation (LREC-COLING 2024)

This paper addresses an existing resource gap for studying complex emotional states when a speaker collaborates with a partner to solve a task. We present a novel dialogue resource — the MULTICOLLAB corpus — where two interlocutors, an instructor and builder, communicated through a Zoom call while sensors recorded eye gaze, facial action units, and galvanic skin response, with transcribed speech signals, resulting in a unique, heavily multimodal corpus. The builder received instructions from the instructor. Half of the builders were privately told to disobey the instructor’s directions. After the task, participants watched the Zoom recording and annotated their instances of frustration. In this study, we introduce this new corpus and perform computational experiments with time series transformers, using early fusion through time for sensor data and late fusion for speech transcripts. We then average predictions from both methods to recognize instructor frustration. Using sensor and speech data in a 4.5 second time window, we find that the fusion of both models yields 21% improvement in classification accuracy (with a precision of 79% and F1 of 63%) over a comparison baseline, demonstrating that complex emotions can be recognized when rich multimodal data from transcribed spoken dialogue and biophysical sensor data are fused.

2022

Transfer Learning Methods for Domain Adaptation in Technical Logbook Datasets
Farhad Akhbardeh | Marcos Zampieri | Cecilia Ovesdotter Alm | Travis Desell
Proceedings of the Thirteenth Language Resources and Evaluation Conference

Event identification in technical logbooks poses challenges given the limited logbook data available in specific technical domains, the large set of possible classes, and logbook entries typically being in short form and non-standard technical language. Technical logbook data typically has both a domain, the field it comes from (e.g., automotive), and an application, what it is used for (e.g., maintenance). In order to better handle the problem of data scarcity, using a variety of technical logbook datasets, this paper investigates the benefits of using transfer learning from sources within the same domain (but different applications), from within the same application (but different domains) and from all available data. Results show that performing transfer learning within a domain provides statistically significant improvements, and in all cases but one the best performance. Interestingly, transfer learning from within the application or across the global dataset degrades results in all cases but one, which benefited from adding as much data as possible. A further analysis of the dataset similarities shows that the datasets with higher similarity scores performed better in transfer learning tasks, suggesting that this can be utilized to determine the effectiveness of adding a dataset in a transfer learning task for technical logbooks.

Proceedings of the 2022 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies: Tutorial Abstracts
Miguel Ballesteros | Yulia Tsvetkov | Cecilia O. Alm
Proceedings of the 2022 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies: Tutorial Abstracts

2021

Unpacking the Interdependent Systems of Discrimination: Ableist Bias in NLP Systems through an Intersectional Lens
Saad Hassan | Matt Huenerfauth | Cecilia Ovesdotter Alm
Findings of the Association for Computational Linguistics: EMNLP 2021

Much of the world’s population experiences some form of disability during their lifetime. Caution must be exercised while designing natural language processing (NLP) systems to prevent systems from inadvertently perpetuating ableist bias against people with disabilities, i.e., prejudice that favors those with typical abilities. We report on various analyses based on word predictions of a large-scale BERT language model. Statistically significant results demonstrate that people with disabilities can be disadvantaged. Findings also explore overlapping forms of discrimination related to interconnected gender and race identities.

Handling Extreme Class Imbalance in Technical Logbook Datasets
Farhad Akhbardeh | Cecilia Ovesdotter Alm | Marcos Zampieri | Travis Desell
Proceedings of the 59th Annual Meeting of the Association for Computational Linguistics and the 11th International Joint Conference on Natural Language Processing (Volume 1: Long Papers)

Technical logbooks are a challenging and under-explored text type in automated event identification. These texts are typically short and written in non-standard yet technical language, posing challenges to off-the-shelf NLP pipelines. The granularity of issue types described in these datasets additionally leads to class imbalance, making it challenging for models to accurately predict which issue each logbook entry describes. In this paper we focus on the problem of technical issue classification by considering logbook datasets from the automotive, aviation, and facilities maintenance domains. We adapt a feedback strategy from computer vision for handling extreme class imbalance, which resamples the training data based on its error in the prediction process. Our experiments show that with statistical significance this feedback strategy provides the best results for four different neural network models trained across a suite of seven different technical logbook datasets from distinct technical domains. The feedback strategy is also generic and could be applied to any learning problem with substantial class imbalances.

2018

A dataset for identifying actionable feedback in collaborative software development
Benjamin S. Meyers | Nuthan Munaiah | Emily Prud’hommeaux | Andrew Meneely | Josephine Wolff | Cecilia Ovesdotter Alm | Pradeep Murukannaiah
Proceedings of the 56th Annual Meeting of the Association for Computational Linguistics (Volume 2: Short Papers)

Software developers and testers have long struggled with how to elicit proactive responses from their coworkers when reviewing code for security vulnerabilities and errors. For a code review to be successful, it must not only identify potential problems but also elicit an active response from the colleague responsible for modifying the code. To understand the factors that contribute to this outcome, we analyze a novel dataset of more than one million code reviews for the Google Chromium project, from which we extract linguistic features of feedback that elicited responsive actions from coworkers. Using a manually-labeled subset of reviewer comments, we trained a highly accurate classifier to identify acted-upon comments (AUC = 0.85). Our results demonstrate the utility of our dataset, the feasibility of using NLP for this new task, and the potential of NLP to improve our understanding of how communications between colleagues can be authored to elicit positive, proactive responses.

SNAG: Spoken Narratives and Gaze Dataset
Preethi Vaidyanathan | Emily T. Prud’hommeaux | Jeff B. Pelz | Cecilia O. Alm
Proceedings of the 56th Annual Meeting of the Association for Computational Linguistics (Volume 2: Short Papers)

Humans rely on multiple sensory modalities when examining and reasoning over images. In this paper, we describe a new multimodal dataset that consists of gaze measurements and spoken descriptions collected in parallel during an image inspection task. The task was performed by multiple participants on 100 general-domain images showing everyday objects and activities. We demonstrate the usefulness of the dataset by applying an existing visual-linguistic data fusion framework in order to label important image regions with appropriate linguistic labels.

Sensing and Learning Human Annotators Engaged in Narrative Sensemaking
McKenna Tornblad | Luke Lapresi | Christopher Homan | Raymond Ptucha | Cecilia Ovesdotter Alm
Proceedings of the 2018 Conference of the North American Chapter of the Association for Computational Linguistics: Student Research Workshop

While labor issues and quality assurance in crowdwork are increasingly studied, how annotators make sense of texts and how they are personally impacted by doing so are not. We study these questions via a narrative-sorting annotation task, where carefully selected (by sequentiality, topic, emotional content, and length) collections of tweets serve as examples of everyday storytelling. As readers process these narratives, we measure their facial expressions, galvanic skin response, and self-reported reactions. From the perspective of annotator well-being, a reassuring outcome was that the sorting task did not cause a measurable stress response; however, readers reacted to humor. In terms of sensemaking, readers were more confident when sorting sequential, target-topical, and highly emotional tweets. As crowdsourcing becomes more common, this research sheds light onto the perceptive capabilities and emotional impact of human readers.

2017

Understanding the Semantics of Narratives of Interpersonal Violence through Reader Annotations and Physiological Reactions
Alexander Calderwood | Elizabeth A. Pruett | Raymond Ptucha | Christopher Homan | Cecilia Ovesdotter Alm
Proceedings of the Workshop Computational Semantics Beyond Events and Roles

Interpersonal violence (IPV) is a prominent sociological problem that affects people of all demographic backgrounds. By analyzing how readers interpret, perceive, and react to experiences narrated in social media posts, we explore an understudied source for discourse about abuse. We asked readers to annotate Reddit posts about relationships with vs. without IPV for stakeholder roles and emotion, while measuring their galvanic skin response (GSR), pulse, and facial expression. We map annotations to coreference resolution output to obtain a labeled coreference chain for stakeholders in texts, and apply automated semantic role labeling for analyzing IPV discourse. Findings provide insights into how readers process roles and emotion in narratives. For example, abusers tend to be linked with violent actions and certain affect states. We train classifiers to predict stakeholder categories of coreference chains. We also find that subjects’ GSR noticeably changed for IPV texts, suggesting that co-collected measurement-based data about annotators can be used to support text annotation.

An Analysis and Visualization Tool for Case Study Learning of Linguistic Concepts
Cecilia Ovesdotter Alm | Benjamin Meyers | Emily Prud’hommeaux
Proceedings of the 2017 Conference on Empirical Methods in Natural Language Processing: System Demonstrations

We present an educational tool that integrates computational linguistics resources for use in non-technical undergraduate language science courses. By using the tool in conjunction with evidence-driven pedagogical case studies, we strive to provide opportunities for students to gain an understanding of linguistic concepts and analysis through the lens of realistic problems in feasible ways. Case studies tend to be used in legal, business, and health education contexts, but less in the teaching and learning of linguistics. The approach introduced also has potential to encourage students across training backgrounds to continue on to computational language analysis coursework.

Proceedings of ACL 2017, Student Research Workshop
Allyson Ettinger | Spandana Gella | Matthieu Labeau | Cecilia Ovesdotter Alm | Marine Carpuat | Mark Dredze
Proceedings of ACL 2017, Student Research Workshop

2016

Towards Early Dementia Detection: Fusing Linguistic and Non-Linguistic Clinical Data
Joseph Bullard | Cecilia Ovesdotter Alm | Xumin Liu | Qi Yu | Rubén Proaño
Proceedings of the Third Workshop on Computational Linguistics and Clinical Psychology

Generating Clinically Relevant Texts: A Case Study on Life-Changing Events
Mayuresh Oak | Anil Behera | Titus Thomas | Cecilia Ovesdotter Alm | Emily Prud’hommeaux | Christopher Homan | Raymond Ptucha
Proceedings of the Third Workshop on Computational Linguistics and Clinical Psychology

Understanding Discourse on Work and Job-Related Well-Being in Public Social Media
Tong Liu | Christopher Homan | Cecilia Ovesdotter Alm | Megan Lytle | Ann Marie White | Henry Kautz
Proceedings of the 54th Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers)

Analyzing Gender Bias in Student Evaluations
Andamlak Terkik | Emily Prud’hommeaux | Cecilia Ovesdotter Alm | Christopher Homan | Scott Franklin
Proceedings of COLING 2016, the 26th International Conference on Computational Linguistics: Technical Papers

University students in the United States are routinely asked to provide feedback on the quality of the instruction they have received. Such feedback is widely used by university administrators to evaluate teaching ability, despite growing evidence that students assign lower numerical scores to women and people of color, regardless of the actual quality of instruction. In this paper, we analyze students’ written comments on faculty evaluation forms spanning eight years and five STEM disciplines in order to determine whether open-ended comments reflect these same biases. First, we apply sentiment analysis techniques to the corpus of comments to determine the overall affect of each comment. We then use this information, in combination with other features, to explore whether there is bias in how students describe their instructors. We show that while the gender of the evaluated instructor does not seem to affect students’ expressed level of overall satisfaction with their instruction, it does strongly influence the language that they use to describe their instructors and their experience in class.

2015

Alignment of Eye Movements and Spoken Language for Semantic Image Understanding
Preethi Vaidyanathan | Emily Prud’hommeaux | Cecilia O. Alm | Jeff B. Pelz | Anne R. Haake
Proceedings of the 11th International Conference on Computational Semantics

Computational Integration of Human Vision and Natural Language through Bitext Alignment
Preethi Vaidyanathan | Emily Prud’hommeaux | Cecilia O. Alm | Jeff B. Pelz | Anne R. Haake
Proceedings of the Fourth Workshop on Vision and Language

#WhyIStayed, #WhyILeft: Microblogging to Make Sense of Domestic Abuse
Nicolas Schrading | Cecilia Ovesdotter Alm | Raymond Ptucha | Christopher Homan
Proceedings of the 2015 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies

An Analysis of Domestic Abuse Discourse on Reddit
Nicolas Schrading | Cecilia Ovesdotter Alm | Ray Ptucha | Christopher Homan
Proceedings of the 2015 Conference on Empirical Methods in Natural Language Processing

Computational Analysis of Affect and Emotion in Language
Saif Mohammad | Cecilia Ovesdotter Alm
Proceedings of the 2015 Conference on Empirical Methods in Natural Language Processing: Tutorial Abstracts

Computational linguistics has witnessed a surge of interest in approaches to emotion and affect analysis, tackling problems that extend beyond sentiment analysis in depth and complexity. This area involves basic emotions (such as joy, sadness, and fear) as well as any of the hundreds of other emotions humans are capable of (such as optimism, frustration, and guilt), expanding into affective conditions, experiences, and activities. Leveraging linguistic data for computational affect and emotion inference enables opportunities to address a range of affect-related tasks, problems, and non-invasive applications that capture aspects essential to the human condition and individuals’ cognitive processes. These efforts enable and facilitate human-centered computing experiences, as demonstrated by applications across clinical, socio-political, artistic, educational, and commercial domains. Efforts to computationally detect, characterize, and generate emotions or affect-related phenomena respond equally to technological needs for personalized, micro-level analytics and broad-coverage, macro-level inference, and they have involved both small and massive amounts of data. While this is an exciting area with numerous opportunities for members of the ACL community, a major obstacle is its intersection with other investigatory traditions, necessitating knowledge transfer. This tutorial comprehensively integrates relevant concepts and frameworks from linguistics, cognitive science, affective computing, and computational linguistics in order to equip researchers and practitioners with the adequate background and knowledge to work effectively on problems and tasks either directly involving, or benefiting from having an understanding of, affect and emotion analysis. There is a substantial body of work in traditional sentiment analysis focusing on positive and negative sentiment. This tutorial covers approaches and features that migrate well to affect analysis. We also discuss key differences from sentiment analysis, and their implications for analyzing affect and emotion. The tutorial begins with an introduction that highlights opportunities, key terminology, and interesting tasks and challenges (1). The body of the tutorial covers characteristics of emotive language use with emphasis on relevance for computational analysis (2); linguistic data, from conceptual analysis frameworks via useful existing resources to important annotation topics (3); computational approaches for lexical semantic emotion analysis (4); computational approaches for emotion and affect analysis in text (5); visualization methods (6); and a survey of application areas with affect-related problems (7). The tutorial concludes with an outline of future directions and a discussion with participants about the areas relevant to their respective tasks of interest (8). Besides attending the tutorial, tutorial participants receive electronic copies of tutorial slides, a complete reference list, as well as a categorized annotated bibliography that concentrates on seminal works, recent important publications, and other products and resources for researchers and developers.

2014

Towards multimodal modeling of physicians’ diagnostic confidence and self-awareness using medical narratives
Joseph Bullard | Cecilia Ovesdotter Alm | Qi Yu | Pengcheng Shi | Anne Haake
Proceedings of COLING 2014, the 25th International Conference on Computational Linguistics: Technical Papers

Computational analysis to explore authors’ depiction of characters
Joseph Bullard | Cecilia Ovesdotter Alm
Proceedings of the 3rd Workshop on Computational Linguistics for Literature (CLFL)

Toward Macro-Insights for Suicide Prevention: Analyzing Fine-Grained Distress at Scale
Christopher Homan | Ravdeep Johar | Tong Liu | Megan Lytle | Vincent Silenzio | Cecilia Ovesdotter Alm
Proceedings of the Workshop on Computational Linguistics and Clinical Psychology: From Linguistic Signal to Clinical Reality

Decision Style in a Clinical Reasoning Corpus
Limor Hochberg | Cecilia Ovesdotter Alm | Esa M. Rantanen | Caroline M. DeLong | Anne Haake
Proceedings of BioNLP 2014

Towards Automatic Annotation of Clinical Decision-Making Style
Limor Hochberg | Cecilia Ovesdotter Alm | Esa M. Rantanen | Qi Yu | Caroline M. DeLong | Anne Haake
Proceedings of LAW VIII - The 8th Linguistic Annotation Workshop

2012

Detecting Distressed and Non-distressed Affect States in Short Forum Texts
Michael Thaul Lehrman | Cecilia Ovesdotter Alm | Rubén A. Proaño
Proceedings of the Second Workshop on Language in Social Media

Annotation Schemes to Encode Domain Knowledge in Medical Narratives
Wilson McCoy | Cecilia Ovesdotter Alm | Cara Calvelli | Rui Li | Jeff B. Pelz | Pengcheng Shi | Anne Haake
Proceedings of the Sixth Linguistic Annotation Workshop

Disfluencies as Extra-Propositional Indicators of Cognitive Processing
Kathryn Womack | Wilson McCoy | Cecilia Ovesdotter Alm | Cara Calvelli | Jeff B. Pelz | Pengcheng Shi | Anne Haake
Proceedings of the Workshop on Extra-Propositional Aspects of Meaning in Computational Linguistics

Linking Uncertainty in Physicians’ Narratives to Diagnostic Correctness
Wilson McCoy | Cecilia Ovesdotter Alm | Cara Calvelli | Jeff B. Pelz | Pengcheng Shi | Anne Haake
Proceedings of the Workshop on Extra-Propositional Aspects of Meaning in Computational Linguistics

2011

Subjective Natural Language Problems: Motivations, Applications, Characterizations, and Implications
Cecilia Ovesdotter Alm
Proceedings of the 49th Annual Meeting of the Association for Computational Linguistics: Human Language Technologies

2010

Characteristics of High Agreement Affect Annotation in Text
Cecilia Ovesdotter Alm
Proceedings of the Fourth Linguistic Annotation Workshop

2006

Challenges for Annotating Images for Sense Disambiguation
Cecilia Ovesdotter Alm | Nicolas Loeff | David A. Forsyth
Proceedings of the Workshop on Frontiers in Linguistically Annotated Corpora 2006

Discriminating Image Senses by Clustering with Multimodal Features
Nicolas Loeff | Cecilia Ovesdotter Alm | David A. Forsyth
Proceedings of the COLING/ACL 2006 Main Conference Poster Sessions

2005

Emotions from Text: Machine Learning for Text-based Emotion Prediction
Cecilia Ovesdotter Alm | Dan Roth | Richard Sproat
Proceedings of Human Language Technology Conference and Conference on Empirical Methods in Natural Language Processing