Natalie Parde


pdf bib
The AI Doctor Is In: A Survey of Task-Oriented Dialogue Systems for Healthcare Applications
Mina Valizadeh | Natalie Parde
Proceedings of the 60th Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers)

Task-oriented dialogue systems are increasingly prevalent in healthcare settings, and have been characterized by a diverse range of architectures and objectives. Although these systems have been surveyed in the medical community from a non-technical perspective, a systematic review from a rigorous computational perspective has to date remained noticeably absent. As a result, many important implementation details of healthcare-oriented dialogue systems remain limited or underspecified, slowing the pace of innovation in this area. To fill this gap, we investigated an initial pool of 4070 papers from well-known computer science, natural language processing, and artificial intelligence venues, identifying 70 papers discussing the system-level implementation of task-oriented dialogue systems for healthcare applications. We conducted a comprehensive technical review of these papers, and present our key findings including identified gaps and corresponding recommendations.

pdf bib
How You Say It Matters: Measuring the Impact of Verbal Disfluency Tags on Automated Dementia Detection
Shahla Farzana | Ashwin Deshpande | Natalie Parde
Proceedings of the 21st Workshop on Biomedical Language Processing

Automatic speech recognition (ASR) systems usually incorporate postprocessing mechanisms to remove disfluencies, facilitating the generation of clear, fluent transcripts that are conducive to many downstream NLP tasks. However, verbal disfluencies have proved to be predictive of dementia status, although little is known about how various types of verbal disfluencies, nor automatically detected disfluencies, affect predictive performance. We experiment with an off-the-shelf disfluency annotator to tag disfluencies in speech transcripts for a well-known cognitive health assessment task. We evaluate the performance of this model on detecting repetitions and corrections or retracing, and measure the influence of gold annotated versus automatically detected verbal disfluencies on dementia detection through a series of experiments. We find that removing both gold and automatically-detected disfluencies negatively impacts dementia detection performance, degrading classification accuracy by 5.6% and 3% respectively


pdf bib
Identifying Medical Self-Disclosure in Online Communities
Mina Valizadeh | Pardis Ranjbar-Noiey | Cornelia Caragea | Natalie Parde
Proceedings of the 2021 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies

Self-disclosure in online health conversations may offer a host of benefits, including earlier detection and treatment of medical issues that may have otherwise gone unaddressed. However, research analyzing medical self-disclosure in online communities is limited. We address this shortcoming by introducing a new dataset of health-related posts collected from online social platforms, categorized into three groups (No Self-Disclosure, Possible Self-Disclosure, and Clear Self-Disclosure) with high inter-annotator agreement (_k_=0.88). We make this data available to the research community. We also release a predictive model trained on this dataset that achieves an accuracy of 81.02%, establishing a strong performance benchmark for this task.

pdf bib
Using Deep Learning to Correlate Reddit Posts with Economic Time Series During the COVID-19 Pandemic
Philip Hossu | Natalie Parde
Proceedings of the Third Workshop on Financial Technology and Natural Language Processing


pdf bib
UIC-NLP at SemEval-2020 Task 10: Exploring an Alternate Perspective on Evaluation
Philip Hossu | Natalie Parde
Proceedings of the Fourteenth Workshop on Semantic Evaluation

In this work we describe and analyze a supervised learning system for word emphasis selection in phrases drawn from visual media as a part of the Semeval 2020 Shared Task 10. More specifically, we begin by briefly introducing the shared task problem and provide an analysis of interesting and relevant features present in the training dataset. We then introduce our LSTM-based model and describe its structure, input features, and limitations. Our model ultimately failed to beat the benchmark score, achieving an average match() score of 0.704 on the validation data (0.659 on the test data) but predicted 84.8% of words correctly considering a 0.5 threshold. We conclude with a thorough analysis and discussion of erroneous predictions with many examples and visualizations.

pdf bib
Modeling Dialogue in Conversational Cognitive Health Screening Interviews
Shahla Farzana | Mina Valizadeh | Natalie Parde
Proceedings of the 12th Language Resources and Evaluation Conference

Automating straightforward clinical tasks can reduce workload for healthcare professionals, increase accessibility for geographically-isolated patients, and alleviate some of the economic burdens associated with healthcare. A variety of preliminary screening procedures are potentially suitable for automation, and one such domain that has remained underexplored to date is that of structured clinical interviews. A task-specific dialogue agent is needed to automate the collection of conversational speech for further (either manual or automated) analysis, and to build such an agent, a dialogue manager must be trained to respond to patient utterances in a manner similar to a human interviewer. To facilitate the development of such an agent, we propose an annotation schema for assigning dialogue act labels to utterances in patient-interviewer conversations collected as part of a clinically-validated cognitive health screening task. We build a labeled corpus using the schema, and show that it is characterized by high inter-annotator agreement. We establish a benchmark dialogue act classification model for the corpus, thereby providing a proof of concept for the proposed annotation schema. The resulting dialogue act corpus is the first such corpus specifically designed to facilitate automated cognitive health screening, and lays the groundwork for future exploration in this area.


pdf bib
The Steep Road to Happily Ever after: an Analysis of Current Visual Storytelling Models
Yatri Modi | Natalie Parde
Proceedings of the Second Workshop on Shortcomings in Vision and Language

Visual storytelling is an intriguing and complex task that only recently entered the research arena. In this work, we survey relevant work to date, and conduct a thorough error analysis of three very recent approaches to visual storytelling. We categorize and provide examples of common types of errors, and identify key shortcomings in current work. Finally, we make recommendations for addressing these limitations in the future.

pdf bib
Enriching Neural Models with Targeted Features for Dementia Detection
Flavio Di Palo | Natalie Parde
Proceedings of the 57th Annual Meeting of the Association for Computational Linguistics: Student Research Workshop

Alzheimers disease is an irreversible brain disease that slowly destroys memory skills andthinking skills leading to the need for full-time care. Early detection of Alzheimer’s dis-ease is fundamental to slow down the progress of the disease. In this work we are developing Natural Language Processing techniques to detect linguistic characteristics of patients suffering Alzheimer’s Disease and related Dementias. We are proposing a neural model based on a CNN-LSTM architecture that is able to take in consideration both long language samples and hand-crafted linguistic features to distinguish between dementia affected and healthy patients. We are exploring the effects of the introduction of an attention mechanism on both our model and the actual state of the art. Our approach is able to set a new state-of-the art on the DementiaBank dataset achieving an F1 Score of 0.929 in the Dementia patients classification Supplementary material include code to run the experiments.


pdf bib
A Corpus of Metaphor Novelty Scores for Syntactically-Related Word Pairs
Natalie Parde | Rodney Nielsen
Proceedings of the Eleventh International Conference on Language Resources and Evaluation (LREC 2018)

pdf bib
Detecting Sarcasm is Extremely Easy ;-)
Natalie Parde | Rodney Nielsen
Proceedings of the Workshop on Computational Semantics beyond Events and Roles

Detecting sarcasm in text is a particularly challenging problem in computational semantics, and its solution may vary across different types of text. We analyze the performance of a domain-general sarcasm detection system on datasets from two very different domains: Twitter, and Amazon product reviews. We categorize the errors that we identify with each, and make recommendations for addressing these issues in NLP systems in the future.

pdf bib
Automatically Generating Questions about Novel Metaphors in Literature
Natalie Parde | Rodney Nielsen
Proceedings of the 11th International Conference on Natural Language Generation

The automatic generation of stimulating questions is crucial to the development of intelligent cognitive exercise applications. We developed an approach that generates appropriate Questioning the Author queries based on novel metaphors in diverse syntactic relations in literature. We show that the generated questions are comparable to human-generated questions in terms of naturalness, sensibility, and depth, and score slightly higher than human-generated questions in terms of clarity. We also show that questions generated about novel metaphors are rated as cognitively deeper than questions generated about non- or conventional metaphors, providing evidence that metaphor novelty can be leveraged to promote cognitive exercise.


pdf bib
Finding Patterns in Noisy Crowds: Regression-based Annotation Aggregation for Crowdsourced Data
Natalie Parde | Rodney Nielsen
Proceedings of the 2017 Conference on Empirical Methods in Natural Language Processing

Crowdsourcing offers a convenient means of obtaining labeled data quickly and inexpensively. However, crowdsourced labels are often noisier than expert-annotated data, making it difficult to aggregate them meaningfully. We present an aggregation approach that learns a regression model from crowdsourced annotations to predict aggregated labels for instances that have no expert adjudications. The predicted labels achieve a correlation of 0.594 with expert labels on our data, outperforming the best alternative aggregation method by 11.9%. Our approach also outperforms the alternatives on third-party datasets.