Severine Verlinden


2022

Lessons Learned from GPT-SW3: Building the First Large-Scale Generative Language Model for Swedish
Ariel Ekgren | Amaru Cuba Gyllensten | Evangelia Gogoulou | Alice Heiman | Severine Verlinden | Joey Öhman | Fredrik Carlsson | Magnus Sahlgren
Proceedings of the Thirteenth Language Resources and Evaluation Conference

We present GPT-SW3, a 3.5 billion parameter autoregressive language model, trained on a newly created 100 GB Swedish corpus. This paper provides insights with regard to data collection and training, while highlighting the challenges of proper model evaluation. The results of quantitative evaluation through perplexity indicate that GPT-SW3 is a competent model in comparison with existing autoregressive models of similar size. Additionally, we perform an extensive prompting study which reveals the good text generation capabilities of GPT-SW3.

Fine-Grained Controllable Text Generation Using Non-Residual Prompting
Fredrik Carlsson | Joey Öhman | Fangyu Liu | Severine Verlinden | Joakim Nivre | Magnus Sahlgren
Proceedings of the 60th Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers)

The introduction of immensely large Causal Language Models (CLMs) has rejuvenated the interest in open-ended text generation. However, controlling the generative process for these Transformer-based models is largely an unsolved problem. Earlier work has explored either plug-and-play decoding strategies, or more powerful but blunt approaches such as prompting. Hence, there is currently a trade-off between fine-grained control and the capability for more expressive high-level instructions. To alleviate this trade-off, we propose an encoder-decoder architecture that enables intermediate text prompts at arbitrary time steps. We propose a resource-efficient method for converting a pre-trained CLM into this architecture, and demonstrate its potential in various experiments, including the novel task of contextualized word inclusion. Our method provides strong results in multiple experimental settings, proving itself to be both expressive and versatile.

2021

Injecting Knowledge Base Information into End-to-End Joint Entity and Relation Extraction and Coreference Resolution
Severine Verlinden | Klim Zaporojets | Johannes Deleu | Thomas Demeester | Chris Develder
Findings of the Association for Computational Linguistics: ACL-IJCNLP 2021

Frozen Pretrained Transformers for Neural Sign Language Translation
Mathieu De Coster | Karel D’Oosterlinck | Marija Pizurica | Paloma Rabaey | Severine Verlinden | Mieke Van Herreweghe | Joni Dambre
Proceedings of the 1st International Workshop on Automatic Translation for Signed and Spoken Languages (AT4SSL)

One of the major challenges in sign language translation from a sign language to a spoken language is the lack of parallel corpora. Recent works have achieved promising results on the RWTH-PHOENIX-Weather 2014T dataset, which consists of over eight thousand parallel sentences between German sign language and German. However, from the perspective of neural machine translation, this is still a tiny dataset. To improve the performance of models trained on small datasets, transfer learning can be used. While this has been previously applied in sign language translation for feature extraction, to the best of our knowledge, pretrained language models have not yet been investigated. We use pretrained BERT-base and mBART-50 models to initialize our sign language video to spoken language text translation model. To mitigate overfitting, we apply the frozen pretrained transformer technique: we freeze the majority of parameters during training. Using a pretrained BERT model, we outperform a baseline trained from scratch by 1 to 2 BLEU-4. Our results show that pretrained language models can be used to improve sign language translation performance and that the self-attention patterns in BERT transfer in zero-shot to the encoder and decoder of sign language translation models.