Tim Isbister


2024

GPT-SW3: An Autoregressive Language Model for the Scandinavian Languages
Ariel Ekgren | Amaru Cuba Gyllensten | Felix Stollenwerk | Joey Öhman | Tim Isbister | Evangelia Gogoulou | Fredrik Carlsson | Judit Casademont | Magnus Sahlgren
Proceedings of the 2024 Joint International Conference on Computational Linguistics, Language Resources and Evaluation (LREC-COLING 2024)

This paper details the process of developing the first native large generative language model for the North Germanic languages, GPT-SW3. We cover all parts of the development process, from data collection and processing, training configuration and instruction finetuning, to evaluation, applications, and considerations for release strategies. We discuss pros and cons of developing large language models for smaller languages and in relatively peripheral regions of the globe, and we hope that this paper can serve as a guide and reference for other researchers that undertake the development of large generative models for smaller languages.

2023

Superlim: A Swedish Language Understanding Evaluation Benchmark
Aleksandrs Berdicevskis | Gerlof Bouma | Robin Kurtz | Felix Morger | Joey Öhman | Yvonne Adesam | Lars Borin | Dana Dannélls | Markus Forsberg | Tim Isbister | Anna Lindahl | Martin Malmsten | Faton Rekathati | Magnus Sahlgren | Elena Volodina | Love Börjeson | Simon Hengchen | Nina Tahmasebi
Proceedings of the 2023 Conference on Empirical Methods in Natural Language Processing

We present Superlim, a multi-task NLP benchmark and analysis platform for evaluating Swedish language models, a counterpart to the English-language (Super)GLUE suite. We describe the dataset, the tasks, and the leaderboard, and report the baseline results yielded by a reference implementation. The tested models do not approach ceiling performance on any of the tasks, which suggests that Superlim is truly difficult, a desirable quality for a benchmark. We address methodological challenges, such as mitigating the Anglocentric bias when creating datasets for a less-resourced language; choosing the most appropriate measures; documenting the datasets and making the leaderboard convenient and transparent. We also highlight other potential uses of the dataset, such as the evaluation of cross-lingual transfer learning.

2022

Cross-lingual Transfer of Monolingual Models
Evangelia Gogoulou | Ariel Ekgren | Tim Isbister | Magnus Sahlgren
Proceedings of the Thirteenth Language Resources and Evaluation Conference

Recent studies in cross-lingual learning using multilingual models have cast doubt on the previous hypothesis that shared vocabulary and joint pre-training are the keys to cross-lingual generalization. We introduce a method for transferring monolingual models to other languages through continuous pre-training and study the effects of such transfer from four different languages to English. Our experimental results on GLUE show that the transferred models outperform an English model trained from scratch, independently of the source language. After probing the model representations, we find that model knowledge from the source language enhances the learning of syntactic and semantic knowledge in English.

2021

Should we Stop Training More Monolingual Models, and Simply Use Machine Translation Instead?
Tim Isbister | Fredrik Carlsson | Magnus Sahlgren
Proceedings of the 23rd Nordic Conference on Computational Linguistics (NoDaLiDa)

Most work in NLP makes the assumption that it is desirable to develop solutions in the native language in question. There is consequently a strong trend towards building native language models even for low-resource languages. This paper questions this development, and explores the idea of simply translating the data into English, thereby enabling the use of large-scale pretrained English language models. We demonstrate empirically that a large English language model coupled with modern machine translation outperforms native language models in most Scandinavian languages. The exception to this is Finnish, which we assume is due to inferior translation quality. Our results suggest that machine translation is a mature technology, which raises a serious counter-argument to training native language models for low-resource languages. This paper therefore strives to make a provocative but important point. As English language models are improving at an unprecedented pace, which in turn improves machine translation, it is, from an empirical and environmental standpoint, more effective to translate data from low-resource languages into English than to build language models for such languages.

2019

Dick-Preston and Morbo at SemEval-2019 Task 4: Transfer Learning for Hyperpartisan News Detection
Tim Isbister | Fredrik Johansson
Proceedings of the 13th International Workshop on Semantic Evaluation

In a world of information operations, influence campaigns, and fake news, classification of news articles as following hyperpartisan argumentation or not is becoming increasingly important. We present a deep learning-based approach in which a pre-trained language model has been fine-tuned on domain-specific data and used for classification of news articles, as part of the SemEval-2019 task on hyperpartisan news detection. The suggested approach yields accuracy and F1-scores of around 0.8, which places the best-performing classifier among the top 5 systems in the competition.

2018

Learning Representations for Detecting Abusive Language
Magnus Sahlgren | Tim Isbister | Fredrik Olsson
Proceedings of the 2nd Workshop on Abusive Language Online (ALW2)

This paper discusses the question of whether it is possible to learn a generic representation that is useful for detecting various types of abusive language. The approach is inspired by recent advances in transfer learning and word embeddings, and we learn representations from two different datasets containing various degrees of abusive language. We compare the learned representation with two standard approaches: one based on lexica, and one based on data-specific n-grams. Our experiments show that learned representations do contain useful information that can be used to improve detection performance when training data is limited.