Prasha Shrestha


2021

CrossCheck: Rapid, Reproducible, and Interpretable Model Evaluation
Dustin Arendt | Zhuanyi Shaw | Prasha Shrestha | Ellyn Ayton | Maria Glenski | Svitlana Volkova
Proceedings of the Second Workshop on Data Science with Human in the Loop: Language Advances

Evaluation beyond aggregate performance metrics, e.g. F1-score, is crucial to both establish an appropriate level of trust in machine learning models and identify avenues for future model improvements. In this paper we demonstrate CrossCheck, an interactive capability for rapid cross-model comparison and reproducible error analysis. We describe the tool, discuss design and implementation details, and present three NLP use cases (named entity recognition, reading comprehension, and clickbait detection) that show the benefits of using the tool for model evaluation. CrossCheck enables users to make informed decisions when choosing between multiple models, identify when the models are correct and for which examples, investigate whether the models are making the same mistakes as humans, evaluate models’ generalizability, and highlight models’ limitations, strengths, and weaknesses. Furthermore, CrossCheck is implemented as a Jupyter widget, which allows for rapid and convenient integration into existing model development workflows.

2019

Jointly Learning Author and Annotated Character N-gram Embeddings: A Case Study in Literary Text
Suraj Maharjan | Deepthi Mave | Prasha Shrestha | Manuel Montes | Fabio A. González | Thamar Solorio
Proceedings of the International Conference on Recent Advances in Natural Language Processing (RANLP 2019)

An author’s way of presenting a story through their writing style has a great impact on whether readers will like the story. In this paper, we learn representations for authors of literary texts together with representations for character n-grams annotated with their functional roles. We train a neural character n-gram based language model using an external corpus of literary texts and transfer learned representations for use in downstream tasks. We show that augmenting the knowledge from external works of authors produces results competitive with other style-based methods for book likability prediction, genre classification, and authorship attribution.

2017

Convolutional Neural Networks for Authorship Attribution of Short Texts
Prasha Shrestha | Sebastian Sierra | Fabio González | Manuel Montes | Paolo Rosso | Thamar Solorio
Proceedings of the 15th Conference of the European Chapter of the Association for Computational Linguistics: Volume 2, Short Papers

We present a model to perform authorship attribution of tweets using Convolutional Neural Networks (CNNs) over character n-grams. We also present a strategy that improves model interpretability by estimating the importance of input text fragments in the predicted classification. The experimental evaluation shows that text CNNs perform competitively and are able to outperform previous methods.

2016

Semi-supervised CLPsych 2016 Shared Task System Submission
Nicolas Rey-Villamizar | Prasha Shrestha | Thamar Solorio | Farig Sadeque | Steven Bethard | Ted Pedersen
Proceedings of the Third Workshop on Computational Linguistics and Clinical Psychology

Analysis of Anxious Word Usage on Online Health Forums
Nicolas Rey-Villamizar | Prasha Shrestha | Farig Sadeque | Steven Bethard | Ted Pedersen | Arjun Mukherjee | Thamar Solorio
Proceedings of the Seventh International Workshop on Health Text Mining and Information Analysis

Why Do They Leave: Modeling Participation in Online Depression Forums
Farig Sadeque | Ted Pedersen | Thamar Solorio | Prasha Shrestha | Nicolas Rey-Villamizar | Steven Bethard
Proceedings of the Fourth International Workshop on Natural Language Processing for Social Media

Age and Gender Prediction on Health Forum Data
Prasha Shrestha | Nicolas Rey-Villamizar | Farig Sadeque | Ted Pedersen | Steven Bethard | Thamar Solorio
Proceedings of the Tenth International Conference on Language Resources and Evaluation (LREC'16)

Health support forums have become a rich source of data that can be used to improve health care outcomes. A user profile, including information such as age and gender, can support targeted analysis of forum data. But users do not always disclose their age and gender, so it is desirable to extract this information automatically from their content. However, to the best of our knowledge there is no such resource for author profiling of health forum data. Here we present a large corpus, with close to 85,000 users, for profiling and also outline our approach and benchmark results for automatically detecting a user’s age and gender from their forum posts. We use a mix of features from a user’s text as well as forum-specific features to obtain accuracy well above the baseline, thus showing that both our dataset and our method are useful and valid.

2015

Predicting Continued Participation in Online Health Forums
Farig Sadeque | Thamar Solorio | Ted Pedersen | Prasha Shrestha | Steven Bethard
Proceedings of the Sixth International Workshop on Health Text Mining and Information Analysis