Jelke Bloem

2024

pdf bib abs
Towards quantifying politicization in foreign aid project reports
Sidi Wang | Gustav Eggers | Alexia de Roode Torres Georgiadis | Tuan Anh Đo | Léa Gontard | Ruth Carlitz | Jelke Bloem
Proceedings of the Second Workshop on Natural Language Processing for Political Sciences @ LREC-COLING 2024

We aim to develop a metric of politicization by investigating whether this concept can be operationalized computationally using document embeddings. We are interested in measuring the extent to which foreign aid is politicized. Textual reports of foreign aid projects are often made available by donor governments, but these are large and unstructured. By embedding them in vector space, we can compute similarities between sets of known politicized keywords and the foreign aid reports. We present a pilot study where we apply this metric to USAID reports.

pdf bib abs
Automatic Animacy Classification for Romanian Nouns
Maria Tepei | Jelke Bloem
Proceedings of the 2024 Joint International Conference on Computational Linguistics, Language Resources and Evaluation (LREC-COLING 2024)

We introduce the first Romanian animacy classifier, specifically a type-based binary classifier of Romanian nouns into the classes human/non-human, using pre-trained word embeddings and animacy information derived from Romanian WordNet. By obtaining a seed set of labeled nouns and their embeddings, we are able to train classifiers that generalize to unseen nouns. We compare three different architectures and observe good performance on classifying word types. In addition, we manually annotate a small corpus for animacy to perform a token-based evaluation of Romanian animacy classification in a naturalistic setting, which reveals limitations of the type-based classification approach.

pdf bib abs
Impact of Task Adapting on Transformer Models for Targeted Sentiment Analysis in Croatian Headlines
Sofia Lee | Jelke Bloem
Proceedings of the 2024 Joint International Conference on Computational Linguistics, Language Resources and Evaluation (LREC-COLING 2024)

Transformer models, such as BERT, are often taken off-the-shelf and then fine-tuned on a downstream task. Although this is sufficient for many tasks, low-resource settings require special attention. We demonstrate an approach of performing an extra stage of self-supervised task-adaptive pre-training to a number of Croatian-supporting Transformer models. In particular, we focus on approaches to language, domain, and task adaptation. The task in question is targeted sentiment analysis for Croatian news headlines. We produce new state-of-the-art results (F1 = 0.781), but the highest performing model still struggles with irony and implicature. Overall, we find that task-adaptive pre-training benefits massively multilingual models but not Croatian-dominant models.

pdf bib abs
SimLex-999 for Dutch
Lizzy Brans | Jelke Bloem
Proceedings of the 2024 Joint International Conference on Computational Linguistics, Language Resources and Evaluation (LREC-COLING 2024)

Word embeddings revolutionised natural language processing by effectively representing words as dense vectors. Although many datasets exist to evaluate English embeddings, few cater to Dutch. We developed a Dutch variant of the SimLex-999 word similarity dataset by gathering similarity judgements from 235 native Dutch speakers. Subsequently, we evaluated two popular Dutch language models, Bertje and RobBERT, finding that Bertje showed superior alignment with human semantic similarity judgments compared to RobBERT. This study provides the first intrinsic Dutch word embedding evaluation dataset, which enables accurate assessment of these embeddings and fosters the development of effective Dutch language models.

pdf bib abs
Broadening the coverage of computational representations of metaphor through Dynamic Metaphor Theory
Xiaojuan Tan | Jelke Bloem
Proceedings of the First Workshop on Reference, Framing, and Perspective @ LREC-COLING 2024

Current approaches to computational metaphor processing typically incorporate static representations of metaphor. We aim to show that this limits the coverage of such systems. We take insights from dynamic metaphor theory and discuss how existing computational models of metaphor might benefit from representing the dynamics of metaphor when applied to the analysis of conflicting discourse. We propose that a frame-based approach to metaphor representation based on the model of YinYang Dynamics of Metaphoricity (YYDM) would pave the way to more comprehensive modeling of metaphor. In particular, the metaphoricity cues of the YYDM model could be used to address the task of dynamic metaphor identification. Frame-based modeling of dynamic metaphor would facilitate the computational analysis of perspectives in conflicting discourse, with potential applications in analyzing political discourse.

2023

pdf bib abs
Comparing domain-specific and domain-general BERT variants for inferred real-world knowledge through rare grammatical features in Serbian
Sofia Lee | Jelke Bloem
Proceedings of the 9th Workshop on Slavic Natural Language Processing 2023 (SlavicNLP 2023)

Transfer learning is one of the prevailing approaches towards training language-specific BERT models. However, some languages have uncommon features that may prove to be challenging to more domain-general models but not domain-specific models. Comparing the performance of BERTić, a Bosnian-Croatian-Montenegrin-Serbian model, and Multilingual BERT on a Named-Entity Recognition (NER) task and Masked Language Modelling (MLM) task based around a rare phenomenon of indeclinable female foreign names in Serbian reveals how the different training approaches impacts their performance. Multilingual BERT is shown to perform better than BERTić in the NER task, but BERTić greatly exceeds in the MLM task. Thus, there are applications both for domain-general training and domain-specific training depending on the tasks at hand.

pdf bib abs
Using Collostructional Analysis to evaluate BERT’s representation of linguistic constructions
Tim Veenboer | Jelke Bloem
Findings of the Association for Computational Linguistics: ACL 2023

Collostructional analysis is a technique devised to find correlations between particular words and linguistic constructions in order to analyse meaning associations of these constructions. Contrasting collostructional analysis results with output from BERT might provide insights into the way BERT represents the meaning of linguistic constructions. This study tests to what extent English BERT’s meaning representations correspond to known constructions from the linguistics literature by means of two tasks that we propose. Firstly, by predicting the words that can be used in open slots of constructions, the meaning associations of more lexicalized constructions can be observed. Secondly, by finding similar sequences using BERT’s output embeddings and manually reviewing the resulting sentences, we can observe whether instances of less lexicalized constructions are clustered together in semantic space. These two methods show that BERT represents constructional meaning to a certain extent, but does not separate instances of a construction from a near-synonymous construction that has a different form.

2022

pdf bib abs
Domain-specific Evaluation of Word Embeddings for Philosophical Text using Direct Intrinsic Evaluation
Goya van Boven | Jelke Bloem
Proceedings of the 2nd International Workshop on Natural Language Processing for Digital Humanities

We perform a direct intrinsic evaluation of word embeddings trained on the works of a single philosopher. Six models are compared to human judgements elicited using two tasks: a synonym detection task and a coherence task. We apply a method that elicits judgements based on explicit knowledge from experts, as the linguistic intuition of non-expert participants might differ from that of the philosopher. We find that an in-domain SVD model has the best 1-nearest neighbours for target terms, while transfer learning-based Nonce2Vec performs better for low frequency target terms.

2021

pdf bib
Comparing Contextual and Static Word Embeddings with Small Data
Wei Zhou | Jelke Bloem
Proceedings of the 17th Conference on Natural Language Processing (KONVENS 2021)

pdf bib abs
Challenging distributional models with a conceptual network of philosophical terms
Yvette Oortwijn | Jelke Bloem | Pia Sommerauer | Francois Meyer | Wei Zhou | Antske Fokkens
Proceedings of the 2021 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies

Computational linguistic research on language change through distributional semantic (DS) models has inspired researchers from fields such as philosophy and literary studies, who use these methods for the exploration and comparison of comparatively small datasets traditionally analyzed by close reading. Research on methods for small data is still in early stages and it is not clear which methods achieve the best results. We investigate the possibilities and limitations of using distributional semantic models for analyzing philosophical data by means of a realistic use-case. We provide a ground truth for evaluation created by philosophy experts and a blueprint for using DS models in a sound methodological setup. We compare three methods for creating specialized models from small datasets. Though the models do not perform well enough to directly support philosophers yet, we find that models designed for small data yield promising directions for future work.

pdf bib abs
Eliciting Explicit Knowledge From Domain Experts in Direct Intrinsic Evaluation of Word Embeddings for Specialized Domains
Goya van Boven | Jelke Bloem
Proceedings of the Workshop on Human Evaluation of NLP Systems (HumEval)

We evaluate the use of direct intrinsic word embedding evaluation tasks for specialized language. Our case study is philosophical text: human expert judgements on the relatedness of philosophical terms are elicited using a synonym detection task and a coherence task. Uniquely for our task, experts must rely on explicit knowledge and cannot use their linguistic intuition, which may differ from that of the philosopher. We find that inter-rater agreement rates are similar to those of more conventional semantic annotation tasks, suggesting that these tasks can be used to evaluate word embeddings of text types for which implicit knowledge may not suffice.

2020

pdf bib abs
Distributional Semantics for Neo-Latin
Jelke Bloem | Maria Chiara Parisi | Martin Reynaert | Yvette Oortwijn | Arianna Betti
Proceedings of LT4HALA 2020 - 1st Workshop on Language Technologies for Historical and Ancient Languages

We address the problem of creating and evaluating quality Neo-Latin word embeddings for the purpose of philosophical research, adapting the Nonce2Vec tool to learn embeddings from Neo-Latin sentences. This distributional semantic modeling tool can learn from tiny data incrementally, using a larger background corpus for initialization. We conduct two evaluation tasks: definitional learning of Latin Wikipedia terms, and learning consistent embeddings from 18th century Neo-Latin sentences pertaining to the concept of mathematical method. Our results show that consistent Neo-Latin word embeddings can be learned from this type of data. While our evaluation results are promising, they do not reveal to what extent the learned models match domain expert knowledge of our Neo-Latin texts. Therefore, we propose an additional evaluation method, grounded in expert-annotated data, that would assess whether learned representations are conceptually sound in relation to the domain of study.

pdf bib abs
Expert Concept-Modeling Ground Truth Construction for Word Embeddings Evaluation in Concept-Focused Domains
Arianna Betti | Martin Reynaert | Thijs Ossenkoppele | Yvette Oortwijn | Andrew Salway | Jelke Bloem
Proceedings of the 28th International Conference on Computational Linguistics

We present a novel, domain expert-controlled, replicable procedure for the construction of concept-modeling ground truths with the aim of evaluating the application of word embeddings. In particular, our method is designed to evaluate the application of word and paragraph embeddings in concept-focused textual domains, where a generic ontology does not provide enough information. We illustrate the procedure, and validate it by describing the construction of an expert ground truth, QuiNE-GT. QuiNE-GT is built to answer research questions concerning the concept of naturalized epistemology in QUINE, a 2-million-token, single-author, 20th-century English philosophy corpus of outstanding quality, cleaned up and enriched for the purpose. To the best of our ken, expert concept-modeling ground truths are extremely rare in current literature, nor has the theoretical methodology behind their construction ever been explicitly conceptualised and properly systematised. Expert-controlled concept-modeling ground truths are however essential to allow proper evaluation of word embeddings techniques, and increase their trustworthiness in specialised domains in which the detection of concepts through their expression in texts is important. We highlight challenges, requirements, and prospects for future work.

2019

pdf bib abs
Evaluating the Consistency of Word Embeddings from Small Data
Jelke Bloem | Antske Fokkens | Aurélie Herbelot
Proceedings of the International Conference on Recent Advances in Natural Language Processing (RANLP 2019)

In this work, we address the evaluation of distributional semantic models trained on smaller, domain-specific texts, specifically, philosophical text. Specifically, we inspect the behaviour of models using a pre-trained background space in learning. We propose a measure of consistency which can be used as an evaluation metric when no in-domain gold-standard data is available. This measure simply computes the ability of a model to learn similar embeddings from different parts of some homogeneous data. We show that in spite of being a simple evaluation, consistency actually depends on various combinations of factors, including the nature of the data itself, the model used to train the semantic space, and the frequency of the learnt terms, both in the background space and in the in-domain data of interest.

pdf bib abs
Modeling a Historical Variety of a Low-Resource Language: Language Contact Effects in the Verbal Cluster of Early-Modern Frisian
Jelke Bloem | Arjen Versloot | Fred Weerman
Proceedings of the 1st International Workshop on Computational Approaches to Historical Language Change

Certain phenomena of interest to linguists mainly occur in low-resource languages, such as contact-induced language change. We show that it is possible to study contact-induced language change computationally in a historical variety of a low-resource language, Early-Modern Frisian, by creating a model using features that were established to be relevant in a closely related language, modern Dutch. This allows us to test two hypotheses on two types of language contact that may have taken place between Frisian and Dutch during this time. Our model shows that Frisian verb cluster word orders are associated with different context features than Dutch verb orders, supporting the ‘learned borrowing’ hypothesis.

2016

pdf bib abs
Testing the Processing Hypothesis of word order variation using a probabilistic language model
Jelke Bloem
Proceedings of the Workshop on Computational Linguistics for Linguistic Complexity (CL4LC)

This work investigates the application of a measure of surprisal to modeling a grammatical variation phenomenon between near-synonymous constructions. We investigate a particular variation phenomenon, word order variation in Dutch two-verb clusters, where it has been established that word order choice is affected by processing cost. Several multifactorial corpus studies of Dutch verb clusters have used other measures of processing complexity to show that this factor affects word order choice. This previous work allows us to compare the surprisal measure, which is based on constraint satisfaction theories of language modeling, to those previously used measures, which are more directly linked to empirical observations of processing complexity. Our results show that surprisal does not predict the word order choice by itself, but is a significant predictor when used in a measure of uniform information density (UID). This lends support to the view that human language processing is facilitated not so much by predictable sequences of words but more by sequences of words in which information is spread evenly.