Cascaded Semantic and Positional Self-Attention Network for Document Classification

Juyong Jiang; Jie Zhang; Kai Zhang

doi:10.18653/v1/2020.findings-emnlp.59


Abstract

Transformers have shown great success in learning representations for language modelling. However, how to systematically aggregate semantic information (word embeddings) with positional (or temporal) information (word order) remains an open challenge. In this work, we propose a new architecture, the cascaded semantic and positional self-attention network (CSPAN), that aggregates the two sources of information in the context of document classification. CSPAN uses a semantic self-attention layer cascaded with a Bi-LSTM to process the semantic and positional information sequentially, and then adaptively combines them through a residual connection. Compared with commonly used positional encoding schemes, CSPAN can exploit the interaction between semantics and word positions in a more interpretable and adaptive manner, and it notably improves classification performance while preserving a compact model size and a high convergence rate. We evaluate CSPAN on several benchmark data sets for document classification with careful ablation studies, and demonstrate encouraging results compared with the state of the art.
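The cascade described in the abstract can be illustrated with a minimal NumPy sketch. The abstract does not give layer sizes, the exact form of the residual combination, the pooling step, or the classifier, so everything below is an assumption made for illustration: a single attention head, an additive residual through a hypothetical projection `Wo`, mean pooling over tokens, and a softmax classifier. The function names (`encode`, `cspan_like_forward`, `init_params`) are likewise hypothetical and are not taken from the authors' code.

```python
import numpy as np

def softmax(x, axis=-1):
    x = x - x.max(axis=axis, keepdims=True)  # subtract max for numerical stability
    e = np.exp(x)
    return e / e.sum(axis=axis, keepdims=True)

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def self_attention(X, Wq, Wk, Wv):
    """Scaled dot-product self-attention over embeddings X of shape (T, d).
    No positional encoding is added, so this layer ignores word order."""
    Q, K, V = X @ Wq, X @ Wk, X @ Wv
    scores = Q @ K.T / np.sqrt(K.shape[-1])
    return softmax(scores, axis=-1) @ V

def lstm(X, W, U, b):
    """One LSTM direction over X (T, d); returns hidden states (T, h)."""
    h_dim = U.shape[0]
    h, c = np.zeros(h_dim), np.zeros(h_dim)
    states = []
    for x_t in X:
        i, f, o, g = np.split(x_t @ W + h @ U + b, 4)  # input, forget, output, candidate
        c = sigmoid(f) * c + sigmoid(i) * np.tanh(g)
        h = sigmoid(o) * np.tanh(c)
        states.append(h)
    return np.stack(states)

def bilstm(X, fwd, bwd):
    """Concatenate a left-to-right pass and a right-to-left pass: (T, 2h)."""
    forward = lstm(X, *fwd)
    backward = lstm(X[::-1], *bwd)[::-1]
    return np.concatenate([forward, backward], axis=-1)

def encode(X, p):
    """Document vector: semantic layer -> Bi-LSTM -> residual sum -> mean pool."""
    S = self_attention(X, p["Wq"], p["Wk"], p["Wv"])  # semantic stage, (T, d)
    P = bilstm(S, p["fwd"], p["bwd"])                 # order-aware stage, (T, 2h)
    H = S + P @ p["Wo"]                               # assumed additive residual, (T, d)
    return H.mean(axis=0)                             # assumed mean pooling, (d,)

def cspan_like_forward(X, p):
    """Class probabilities for one document."""
    return softmax(encode(X, p) @ p["Wc"] + p["bc"])

def init_params(d, h, n_classes, seed=0):
    """Small random weights; purely illustrative, not a trained model."""
    rng = np.random.default_rng(seed)
    w = lambda *shape: rng.normal(0.0, 0.1, size=shape)
    lstm_params = lambda: (w(d, 4 * h), w(h, 4 * h), np.zeros(4 * h))
    return {
        "Wq": w(d, d), "Wk": w(d, d), "Wv": w(d, d),
        "fwd": lstm_params(), "bwd": lstm_params(),
        "Wo": w(2 * h, d),
        "Wc": w(d, n_classes), "bc": np.zeros(n_classes),
    }

# Usage: a toy "document" of 5 tokens with 8-dimensional embeddings.
params = init_params(d=8, h=4, n_classes=3)
doc = np.random.default_rng(1).normal(size=(5, 8))
probs = cspan_like_forward(doc, params)
print(probs.shape)
```

In this sketch the attention stage alone is insensitive to word order once its outputs are mean-pooled, while the Bi-LSTM stage that follows it reintroduces order information. That is one way to read the abstract's contrast with positional encoding schemes, though the paper itself should be consulted for the actual design.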

Anthology ID: 2020.findings-emnlp.59
Volume: Findings of the Association for Computational Linguistics: EMNLP 2020
Month: November
Year: 2020
Address: Online
Editors: Trevor Cohn, Yulan He, Yang Liu
Venue: Findings
Publisher: Association for Computational Linguistics
Pages: 669–677
URL: https://aclanthology.org/2020.findings-emnlp.59
DOI: 10.18653/v1/2020.findings-emnlp.59
Cite (ACL): Juyong Jiang, Jie Zhang, and Kai Zhang. 2020. Cascaded Semantic and Positional Self-Attention Network for Document Classification. In Findings of the Association for Computational Linguistics: EMNLP 2020, pages 669–677, Online. Association for Computational Linguistics.
Cite (Informal): Cascaded Semantic and Positional Self-Attention Network for Document Classification (Jiang et al., Findings 2020)
PDF: https://aclanthology.org/2020.findings-emnlp.59.pdf
Data: Yahoo! Answers, Yelp Review Polarity
