Title: Improving Transformer Models by Reordering their Sublayers
Authors: Ofir Press; Noah A Smith; Omer Levy
Date: 2020-07
Type: text
Genre: conference publication
Published in: Proceedings of the 58th Annual Meeting of the Association for Computational Linguistics
Editors: Dan Jurafsky; Joyce Chai; Natalie Schluter; Joel Tetreault
Publisher: Association for Computational Linguistics
Place: Online
Pages: 2996-3005
Identifier: press-etal-2020-improving
DOI: 10.18653/v1/2020.acl-main.270
URL: https://aclanthology.org/2020.acl-main.270/