Title: Optimizing Deeper Transformers on Small Datasets
Authors: Peng Xu, Dhruv Kumar, Wei Yang, Wenjie Zi, Keyi Tang, Chenyang Huang, Jackie Chi Kit Cheung, Simon J.D. Prince, Yanshuai Cao
Date: 2021-08
Published in: Proceedings of the 59th Annual Meeting of the Association for Computational Linguistics and the 11th International Joint Conference on Natural Language Processing (Volume 1: Long Papers)
Editors: Chengqing Zong, Fei Xia, Wenjie Li, Roberto Navigli
Publisher: Association for Computational Linguistics
Location: Online
Genre: conference publication
Identifier: xu-etal-2021-optimizing
DOI: 10.18653/v1/2021.acl-long.163
URL: https://aclanthology.org/2021.acl-long.163/
Pages: 2089-2102