Title: On the importance of pre-training data volume for compact language models
Authors: Vincent Micheli, Martin d’Hoffschmidt, François Fleuret
Date: 2020-11
Type: text (conference publication)
In: Proceedings of the 2020 Conference on Empirical Methods in Natural Language Processing (EMNLP)
Editors: Bonnie Webber, Trevor Cohn, Yulan He, Yang Liu
Publisher: Association for Computational Linguistics
Location: Online
Citation key: micheli-etal-2020-importance
DOI: 10.18653/v1/2020.emnlp-main.632
URL: https://aclanthology.org/2020.emnlp-main.632/
Pages: 7853-7858