2021
ToxCCIn: Toxic Content Classification with Interpretability
Tong Xiang | Sean MacAvaney | Eugene Yang | Nazli Goharian
Proceedings of the Eleventh Workshop on Computational Approaches to Subjectivity, Sentiment and Social Media Analysis
Despite the recent successes of transformer-based models in terms of effectiveness on a variety of tasks, their decisions often remain opaque to humans. Explanations are particularly important for tasks like offensive language or toxicity detection on social media, because a manual appeal process is often in place to dispute automatically flagged content. In this work, we propose a technique to improve the interpretability of these models, based on a simple and powerful assumption: a post is at least as toxic as its most toxic span. We incorporate this assumption into transformer models by scoring a post based on the maximum toxicity of its spans and augmenting the training process to identify correct spans. We find this approach effective: according to a human study, it can produce explanations that exceed the quality of those provided by Logistic Regression analysis (often regarded as a highly interpretable model).
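The abstract's core idea, scoring a post by the maximum toxicity of its spans, can be illustrated with a minimal sketch. The span scorer below is a toy lexicon lookup standing in for the paper's transformer model, and all names and scores are illustrative assumptions, not the authors' implementation:

```python
# Sketch of max-over-spans scoring: a post's toxicity is the maximum
# toxicity of any of its contiguous token spans. The highest-scoring
# span doubles as the explanation for the prediction.

TOXIC_LEXICON = {"idiot": 0.9, "stupid": 0.7}  # hypothetical scores

def span_score(tokens):
    """Toy per-span toxicity: average lexicon score of the tokens."""
    scores = [TOXIC_LEXICON.get(t, 0.0) for t in tokens]
    return sum(scores) / len(scores)

def post_score(tokens, max_len=3):
    """Score every contiguous span up to max_len tokens; keep the max."""
    best_score, best_span = 0.0, None
    for i in range(len(tokens)):
        for j in range(i + 1, min(i + 1 + max_len, len(tokens) + 1)):
            s = span_score(tokens[i:j])
            if s > best_score:
                best_score, best_span = s, (i, j)
    return best_score, best_span

tokens = "you are a stupid idiot honestly".split()
score, (i, j) = post_score(tokens)
print(score, tokens[i:j])  # the most toxic span explains the score
```

Here the single-token span "idiot" (score 0.9) wins over "stupid idiot" (whose average is 0.8), so the post scores 0.9 and the returned span serves as the human-readable explanation.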
2020
GUIR at SemEval-2020 Task 12: Domain-Tuned Contextualized Models for Offensive Language Detection
Sajad Sotudeh | Tong Xiang | Hao-Ren Yao | Sean MacAvaney | Eugene Yang | Nazli Goharian | Ophir Frieder
Proceedings of the Fourteenth Workshop on Semantic Evaluation
Offensive language detection is an important and challenging task in natural language processing. We present our submissions to the OffensEval 2020 shared task, which includes three English sub-tasks: identifying the presence of offensive language (Sub-task A), identifying the presence of a target in offensive language (Sub-task B), and identifying the category of the target (Sub-task C). Our experiments explore using a domain-tuned contextualized language model (namely, BERT) for this task. We also experiment with different components and configurations (e.g., a multi-view SVM) stacked upon BERT models for specific sub-tasks. Our submissions achieve F1 scores of 91.7% in Sub-task A, 66.5% in Sub-task B, and 63.2% in Sub-task C. We perform an ablation study which reveals that domain tuning considerably improves classification performance. Furthermore, error analysis shows common misclassification errors made by our model and outlines directions for future research.
2015
Identifying Political Sentiment between Nation States with Social Media
Nathanael Chambers | Victor Bowen | Ethan Genco | Xisen Tian | Eric Young | Ganesh Harihara | Eugene Yang
Proceedings of the 2015 Conference on Empirical Methods in Natural Language Processing
2013
USNA: A Dual-Classifier Approach to Contextual Sentiment Analysis
Ganesh Harihara | Eugene Yang | Nathanael Chambers
Second Joint Conference on Lexical and Computational Semantics (*SEM), Volume 2: Proceedings of the Seventh International Workshop on Semantic Evaluation (SemEval 2013)