Martin Malmsten
2023
Superlim: A Swedish Language Understanding Evaluation Benchmark
Aleksandrs Berdicevskis | Gerlof Bouma | Robin Kurtz | Felix Morger | Joey Öhman | Yvonne Adesam | Lars Borin | Dana Dannélls | Markus Forsberg | Tim Isbister | Anna Lindahl | Martin Malmsten | Faton Rekathati | Magnus Sahlgren | Elena Volodina | Love Börjeson | Simon Hengchen | Nina Tahmasebi
Proceedings of the 2023 Conference on Empirical Methods in Natural Language Processing
We present Superlim, a multi-task NLP benchmark and analysis platform for evaluating Swedish language models, a counterpart to the English-language (Super)GLUE suite. We describe the dataset, the tasks, and the leaderboard, and report baseline results from a reference implementation. The tested models do not approach ceiling performance on any of the tasks, which suggests that Superlim is truly difficult, a desirable quality for a benchmark. We address methodological challenges such as mitigating Anglocentric bias when creating datasets for a less-resourced language, choosing the most appropriate measures, documenting the datasets, and making the leaderboard convenient and transparent. We also highlight other potential uses of the dataset, such as the evaluation of cross-lingual transfer learning.