Leo Huovinen


2024

Scaling Sustainable Development Goal Predictions across Languages: From English to Finnish
Melany Macias | Lev Kharlashkin | Leo Huovinen | Mika Hämäläinen
Proceedings of the 9th International Workshop on Computational Linguistics for Uralic Languages

In this paper, we use an exclusively English classification dataset of UN Sustainable Development Goals (SDGs) in an education context to train various multilingual classifiers, and we examine how well these models adapt to recognizing the same classes in Finnish university course descriptions. Finnish, with a mere 5 million native speakers, is a significantly less-resourced language than English. The best-performing model in our experiments was mBART, with an F1-score of 0.843.