Title: Aurora-M: Open Source Continual Pre-training for Multilingual Language and Code
Authors: Taishi Nakamura, Mayank Mishra, Simone Tedeschi, Yekun Chai, Jason T Stillerman, Felix Friedrich, Prateek Yadav, Tanmay Laud, Vu Minh Chien, Terry Yue Zhuo, Diganta Misra, Ben Bogin, Xuan-Son Vu, Marzena Karpinska, Arnav Varma Dantuluri, Wojciech Kusa, Tommaso Furlanello, Rio Yokota, Niklas Muennighoff, Suhas Pai, Tosin Adewumi, Veronika Laippala, Xiaozhe Yao, Adalberto Barbosa Junior, Aleksandr Drozd, Jordan Clive, Kshitij Gupta, Liangyu Chen, Qi Sun, Ken Tsui, Nour Moustafa-Fahmy, Nicolo Monti, Tai Dang, Ziyang Luo, Tien-Tung Bui, Roberto Navigli, Virendra Mehta, Matthew Blumberg, Victor May, Hiep Nguyen, Sampo Pyysalo
Date: 2025-01
Type: text, conference publication
Published in: Proceedings of the 31st International Conference on Computational Linguistics: Industry Track
Editors: Owen Rambow, Leo Wanner, Marianna Apidianaki, Hend Al-Khalifa, Barbara Di Eugenio, Steven Schockaert, Kareem Darwish, Apoorv Agarwal
Publisher: Association for Computational Linguistics
Location: Abu Dhabi, UAE
Pages: 656-678
Citation key: nakamura-etal-2025-aurora
URL: https://aclanthology.org/2025.coling-industry.56/