Differentiable Window for Dynamic Local Attention

Thanh-Tung Nguyen, Xuan-Phi Nguyen, Shafiq Joty, Xiaoli Li


Abstract
We propose Differentiable Window, a new neural module and general purpose component for dynamic window selection. While universally applicable, we demonstrate a compelling use case of utilizing Differentiable Window to improve standard attention modules by enabling more focused attentions over the input regions. We propose two variants of Differentiable Window, and integrate them within the Transformer architecture in two novel ways. We evaluate our proposed approach on a myriad of NLP tasks, including machine translation, sentiment analysis, subject-verb agreement and language modeling. Our experimental results demonstrate consistent and sizable improvements across all tasks.
Anthology ID:
2020.acl-main.589
Volume:
Proceedings of the 58th Annual Meeting of the Association for Computational Linguistics
Month:
July
Year:
2020
Address:
Online
Venue:
ACL
Publisher:
Association for Computational Linguistics
Pages:
6589–6599
URL:
https://aclanthology.org/2020.acl-main.589
DOI:
10.18653/v1/2020.acl-main.589
Cite (ACL):
Thanh-Tung Nguyen, Xuan-Phi Nguyen, Shafiq Joty, and Xiaoli Li. 2020. Differentiable Window for Dynamic Local Attention. In Proceedings of the 58th Annual Meeting of the Association for Computational Linguistics, pages 6589–6599, Online. Association for Computational Linguistics.
Cite (Informal):
Differentiable Window for Dynamic Local Attention (Nguyen et al., ACL 2020)
PDF:
https://aclanthology.org/2020.acl-main.589.pdf
Video:
http://slideslive.com/38928814
Data
IMDb Movie Reviews, SST