Scaling Sustainable Development Goal Predictions across Languages: From English to Finnish

Melany Macias, Lev Kharlashkin, Leo Huovinen, Mika Hämäläinen


Abstract
In this paper, we use an exclusively English classification dataset of UN Sustainable Development Goals (SDGs) in an education context to train various multilingual classifiers, and we examine how well these models adapt to recognizing the same classes in Finnish university course descriptions. Finnish, with only about 5 million native speakers, is a significantly less-resourced language than English. The best-performing model in our experiments was mBART, with an F1-score of 0.843.
Anthology ID:
2024.iwclul-1.17
Volume:
Proceedings of the 9th International Workshop on Computational Linguistics for Uralic Languages
Month:
November
Year:
2024
Address:
Helsinki, Finland
Editors:
Mika Hämäläinen, Flammie Pirinen, Melany Macias, Mario Crespo Avila
Venue:
IWCLUL
SIG:
SIGUR
Publisher:
Association for Computational Linguistics
Pages:
132–137
URL:
https://aclanthology.org/2024.iwclul-1.17
Cite (ACL):
Melany Macias, Lev Kharlashkin, Leo Huovinen, and Mika Hämäläinen. 2024. Scaling Sustainable Development Goal Predictions across Languages: From English to Finnish. In Proceedings of the 9th International Workshop on Computational Linguistics for Uralic Languages, pages 132–137, Helsinki, Finland. Association for Computational Linguistics.
Cite (Informal):
Scaling Sustainable Development Goal Predictions across Languages: From English to Finnish (Macias et al., IWCLUL 2024)
PDF:
https://aclanthology.org/2024.iwclul-1.17.pdf