Xiangnan Kong


2020

pdf bib
Human Attention Maps for Text Classification: Do Humans and Neural Networks Focus on the Same Words?
Cansu Sen | Thomas Hartvigsen | Biao Yin | Xiangnan Kong | Elke Rundensteiner
Proceedings of the 58th Annual Meeting of the Association for Computational Linguistics

Motivated by human attention, computational attention mechanisms have been designed to help neural networks adjust their focus on specific parts of the input data. While attention mechanisms are claimed to achieve interpretability, little is known about the actual relationships between machine and human attention. In this work, we conduct the first quantitative assessment of human versus computational attention mechanisms for the text classification task. To achieve this, we design and conduct a large-scale crowd-sourcing study to collect human attention maps that encode the parts of a text that humans focus on when conducting text classification. Based on this new resource of human attention dataset for text classification, YELP-HAT, collected on the publicly available YELP dataset, we perform a quantitative comparative analysis of machine attention maps created by deep learning models and human attention maps. Our analysis offers insights into the relationships between human versus machine attention maps along three dimensions: overlap in word selections, distribution over lexical categories, and context-dependency of sentiment polarity. Our findings open promising future research opportunities ranging from supervised attention to the design of human-centric attention-based explanations.

pdf bib
A Dual-Attention Network for Joint Named Entity Recognition and Sentence Classification of Adverse Drug Events
Susmitha Wunnava | Xiao Qin | Tabassum Kakar | Xiangnan Kong | Elke Rundensteiner
Findings of the Association for Computational Linguistics: EMNLP 2020

An adverse drug event (ADE) is an injury resulting from medical intervention related to a drug. Automatic ADE detection from text is either fine-grained (ADE entity recognition) or coarse-grained (ADE assertive sentence classification), with limited efforts leveraging inter-dependencies among the two granularities. We instead propose a multi-grained joint deep network to concurrently learn the ADE entity recognition and ADE sentence classification tasks. Our joint approach takes advantage of their symbiotic relationship, with a transfer of knowledge between the two levels of granularity. Our dual-attention mechanism constructs multiple distinct representations of a sentence that capture both task-specific and semantic information in the sentence, providing stronger emphasis on the key elements essential for sentence classification. Our model improves state-of- art F1-score for both tasks: (i) entity recognition of ADE words (12.5% increase) and (ii) ADE sentence classification (13.6% increase) on MADE 1.0 benchmark of EHR notes.