Title: Increasing Learning Efficiency of Self-Attention Networks through Direct Position Interactions, Learnable Temperature, and Convoluted Attention
Authors: Philipp Dufter, Martin Schmitt, Hinrich Schütze
Date: 2020-12
Type: text (conference publication)
Proceedings: Proceedings of the 28th International Conference on Computational Linguistics
Editors: Donia Scott, Nuria Bel, Chengqing Zong
Publisher: International Committee on Computational Linguistics
Location: Barcelona, Spain (Online)
Identifier: dufter-etal-2020-increasing
DOI: 10.18653/v1/2020.coling-main.324
URL: https://aclanthology.org/2020.coling-main.324/
Pages: 3630-3636