%0 Conference Proceedings
%T Optimizing Deeper Transformers on Small Datasets
%A Xu, Peng
%A Kumar, Dhruv
%A Yang, Wei
%A Zi, Wenjie
%A Tang, Keyi
%A Huang, Chenyang
%A Cheung, Jackie Chi Kit
%A Prince, Simon J.D.
%A Cao, Yanshuai
%Y Zong, Chengqing
%Y Xia, Fei
%Y Li, Wenjie
%Y Navigli, Roberto
%S Proceedings of the 59th Annual Meeting of the Association for Computational Linguistics and the 11th International Joint Conference on Natural Language Processing (Volume 1: Long Papers)
%D 2021
%8 August
%I Association for Computational Linguistics
%C Online
%F xu-etal-2021-optimizing
%R 10.18653/v1/2021.acl-long.163
%U https://aclanthology.org/2021.acl-long.163/
%U https://doi.org/10.18653/v1/2021.acl-long.163
%P 2089-2102