Karel D’Oosterlinck


2024

Updating CLIP to Prefer Descriptions Over Captions
Amir Zur | Elisa Kreiss | Karel D’Oosterlinck | Christopher Potts | Atticus Geiger
Proceedings of the 2024 Conference on Empirical Methods in Natural Language Processing

Although CLIPScore is a powerful generic metric that captures the similarity between a text and an image, it fails to distinguish between a caption that is meant to complement the information in an image and a description that is meant to replace an image entirely, e.g., for accessibility. We address this shortcoming by updating the CLIP model with the Concadia dataset to assign higher scores to descriptions than captions using parameter-efficient fine-tuning and a loss objective derived from work on causal interpretability. This model correlates with the judgements of blind and low-vision people while preserving transfer capabilities and has interpretable structure that sheds light on the caption–description distinction.

MSCAW-coref: Multilingual, Singleton and Conjunction-Aware Word-Level Coreference Resolution
Houjun Liu | John Bauer | Karel D’Oosterlinck | Christopher Potts | Christopher D. Manning
Proceedings of The Seventh Workshop on Computational Models of Reference, Anaphora and Coreference

2023

BioDEX: Large-Scale Biomedical Adverse Drug Event Extraction for Real-World Pharmacovigilance
Karel D’Oosterlinck | François Remy | Johannes Deleu | Thomas Demeester | Chris Develder | Klim Zaporojets | Aneiss Ghodsi | Simon Ellershaw | Jack Collins | Christopher Potts
Findings of the Association for Computational Linguistics: EMNLP 2023

Timely and accurate extraction of Adverse Drug Events (ADE) from biomedical literature is paramount for public safety, but involves slow and costly manual labor. We set out to improve drug safety monitoring (pharmacovigilance, PV) through the use of Natural Language Processing (NLP). We introduce BioDEX, a large-scale resource for Biomedical adverse Drug Event eXtraction, rooted in the historical output of drug safety reporting in the U.S. BioDEX consists of 65k abstracts and 19k full-text biomedical papers with 256k associated document-level safety reports created by medical experts. The core features of these reports include the reported weight, age, and biological sex of a patient, a set of drugs taken by the patient, the drug dosages, the reactions experienced, and whether the reaction was life-threatening. In this work, we consider the task of predicting the core information of the report given its originating paper. We estimate human performance to be 72.0% F1, whereas our best model achieves 59.1% F1 (62.3 validation), indicating significant headroom. We also begin to explore ways in which these models could help professional PV reviewers. Our code and data are available at https://github.com/KarelDO/BioDEX.

CAW-coref: Conjunction-Aware Word-level Coreference Resolution
Karel D’Oosterlinck | Semere Kiros Bitew | Brandon Papineau | Christopher Potts | Thomas Demeester | Chris Develder
Proceedings of The Sixth Workshop on Computational Models of Reference, Anaphora and Coreference (CRAC 2023)

Rigorously Assessing Natural Language Explanations of Neurons
Jing Huang | Atticus Geiger | Karel D’Oosterlinck | Zhengxuan Wu | Christopher Potts
Proceedings of the 6th BlackboxNLP Workshop: Analyzing and Interpreting Neural Networks for NLP

Natural language is an appealing medium for explaining how large language models process and store information, but evaluating the faithfulness of such explanations is challenging. To help address this, we develop two modes of evaluation for natural language explanations that claim individual neurons represent a concept in a text input. In the *observational mode*, we evaluate claims that a neuron a activates on all and only input strings that refer to a concept picked out by the proposed explanation E. In the *intervention mode*, we construe E as a claim that neuron a is a causal mediator of the concept denoted by E. We apply our framework to the GPT-4-generated explanations of GPT-2 XL neurons of Bills et al. (2023) and show that even the most confident explanations have high error rates and little to no causal efficacy. We close the paper by critically assessing whether natural language is a good choice for explanations and whether neurons are the best level of analysis.

2021

Frozen Pretrained Transformers for Neural Sign Language Translation
Mathieu De Coster | Karel D’Oosterlinck | Marija Pizurica | Paloma Rabaey | Severine Verlinden | Mieke Van Herreweghe | Joni Dambre
Proceedings of the 1st International Workshop on Automatic Translation for Signed and Spoken Languages (AT4SSL)

One of the major challenges in sign language translation from a sign language to a spoken language is the lack of parallel corpora. Recent works have achieved promising results on the RWTH-PHOENIX-Weather 2014T dataset, which consists of over eight thousand parallel sentences between German Sign Language and German. However, from the perspective of neural machine translation, this is still a tiny dataset. To improve the performance of models trained on small datasets, transfer learning can be used. While this has been previously applied in sign language translation for feature extraction, to the best of our knowledge, pretrained language models have not yet been investigated. We use pretrained BERT-base and mBART-50 models to initialize our sign language video to spoken language text translation model. To mitigate overfitting, we apply the frozen pretrained transformer technique: we freeze the majority of parameters during training. Using a pretrained BERT model, we outperform a baseline trained from scratch by 1 to 2 BLEU-4. Our results show that pretrained language models can be used to improve sign language translation performance and that the self-attention patterns in BERT transfer in zero-shot to the encoder and decoder of sign language translation models.