Mojca Brglez


2024

pdf bib
How Human-Like Are Word Associations in Generative Models? An Experiment in Slovene
Špela Vintar | Mojca Brglez | Aleš Žagar
Proceedings of the Workshop on Cognitive Aspects of the Lexicon @ LREC-COLING 2024

Large language models (LLMs) show extraordinary performance in a broad range of cognitive tasks, yet their capability to reproduce human semantic similarity judgements remains disputed. We report an experiment in which we fine-tune two LLMs for Slovene, a monolingual SloT5 and a multilingual mT5, as well as an mT5 for English, to generate word associations. The models are fine-tuned on human word association norms created within the Small World of Words project, which recently started to collect data for Slovene. Since our aim was to explore differences between human and model-generated outputs, the model parameters were minimally adjusted to fit the association task. We perform automatic evaluation using a set of methods to measure the overlap and ranking, and in addition a subset of human and model-generated responses were manually classified into four categories (meaning-, positionand form-based, and erratic). Results show that human-machine overlap is very small, but that the models produce a similar distribution of association categories as humans.

pdf bib
A Computational Analysis of the Dehumanisation of Migrants from Syria and Ukraine in Slovene News Media
Jaya Caporusso | Damar Hoogland | Mojca Brglez | Boshko Koloski | Matthew Purver | Senja Pollak
Proceedings of the 2024 Joint International Conference on Computational Linguistics, Language Resources and Evaluation (LREC-COLING 2024)

Dehumanisation involves the perception and/or treatment of a social group’s members as less than human. This phenomenon is rarely addressed with computational linguistic techniques. We adapt a recently proposed approach for English, making it easier to transfer to other languages and to evaluate, introducing a new sentiment resource, the use of zero-shot cross-lingual valence and arousal detection, and a new method for statistical significance testing. We then apply it to study attitudes to migration expressed in Slovene newspapers, to examine changes in the Slovene discourse on migration between the 2015-16 migration crisis following the war in Syria and the 2022-23 period following the war in Ukraine. We find that while this discourse became more negative and more intense over time, it is less dehumanising when specifically addressing Ukrainian migrants compared to others.

2023

pdf bib
Dispersing the clouds of doubt: can cosine similarity of word embeddings help identify relation-level metaphors in Slovene?
Mojca Brglez
Proceedings of the 9th Workshop on Slavic Natural Language Processing 2023 (SlavicNLP 2023)

Word embeddings and pre-trained language models have achieved great performance in many tasks due to their ability to capture both syntactic and semantic information in their representations. The vector space representations have also been used to identify figurative language shifts such as metaphors, however, the more recent contextualized models have mostly been evaluated via their performance on downstream tasks. In this article, we evaluate static and contextualized word embeddings in terms of their representation and unsupervised identification of relation-level (ADJ-NOUN, NOUN-NOUN) metaphors in Slovene on a set of 24 literal and 24 metaphorical phrases. Our experiments show very promising results for both embedding methods, however, the performance in contextual embeddings notably depends on the layer involved and the input provided to the model.

2022

pdf bib
Extracting and Analysing Metaphors in Migration Media Discourse: towards a Metaphor Annotation Scheme
Ana Zwitter Vitez | Mojca Brglez | Marko Robnik Šikonja | Tadej Škvorc | Andreja Vezovnik | Senja Pollak
Proceedings of the Thirteenth Language Resources and Evaluation Conference

The study of metaphors in media discourse is an increasingly researched topic as media are an important shaper of social reality and metaphors are an indicator of how we think about certain issues through references to other things. We present a neural transfer learning method for detecting metaphorical sentences in Slovene and evaluate its performance on a gold standard corpus of metaphors (classification accuracy of 0.725), as well as on a sample of a domain specific corpus of migrations (precision of 0.40 for extracting domain metaphors and 0.74 if evaluated only on a set of migration related sentences). Based on empirical results and findings of our analysis, we propose a novel metaphor annotation scheme containing linguistic level, conceptual level, and stance information. The new scheme can be used for future metaphor annotations of other socially relevant topics.