Title: Extremely Small BERT Models from Mixed-Vocabulary Training
Authors: Sanqiang Zhao, Raghav Gupta, Yang Song, Denny Zhou
Date: 2021-04
Type: text (conference publication)
Published in: Proceedings of the 16th Conference of the European Chapter of the Association for Computational Linguistics: Main Volume
Editors: Paola Merlo, Jörg Tiedemann, Reut Tsarfaty
Publisher: Association for Computational Linguistics
Location: Online
Citation key: zhao-etal-2021-extremely
DOI: 10.18653/v1/2021.eacl-main.238
URL: https://aclanthology.org/2021.eacl-main.238/
Pages: 2753-2759