Title: HybridBERT - Making BERT Pretraining More Efficient Through Hybrid Mixture of Attention Mechanisms
Authors: Gokul Srinivasagan, Simon Ostermann
Date: 2024-06
Venue: Proceedings of the 2024 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies (Volume 4: Student Research Workshop)
Editors: Yang (Trista) Cao, Isabel Papadimitriou, Anaelia Ovalle, Marcos Zampieri, Francis Ferraro, Swabha Swayamdipta
Publisher: Association for Computational Linguistics
Location: Mexico City, Mexico
Type: conference publication
Anthology ID: srinivasagan-ostermann-2024-hybridbert
DOI: 10.18653/v1/2024.naacl-srw.30
URL: https://aclanthology.org/2024.naacl-srw.30/
Pages: 285-291