MOSEL: 950,000 Hours of Speech Data for Open-Source Speech Foundation Model Training on EU Languages

Authors: Marco Gaido, Sara Papi, Luisa Bentivogli, Alessio Brutti, Mauro Cettolo, Roberto Gretter, Marco Matassoni, Mohamed Nabih, Matteo Negri
Date: 2024-11
Type: text (conference publication)
In: Proceedings of the 2024 Conference on Empirical Methods in Natural Language Processing
Editors: Yaser Al-Onaizan, Mohit Bansal, Yun-Nung Chen
Publisher: Association for Computational Linguistics
Place: Miami, Florida, USA
Pages: 13934-13947
Citation key: gaido-etal-2024-mosel
DOI: 10.18653/v1/2024.emnlp-main.771
URL: https://aclanthology.org/2024.emnlp-main.771/