Ruben van de Vijver


2024

In this paper we explore how morphological information can be extracted from fastText embeddings for Russian nouns. We investigate the negative effects of syncretism and propose ways of modifying the vectors that can help to find better representations for morphological functions and thus for out of vocabulary words. In particular, we look at the effect of analysing shift vectors instead of original vectors, discuss various possibilities of finding base forms to create shift vectors, and show that using only the high frequency data is beneficial when looking for structure with respect to the morphosyntactic functions in the embeddings.

2022

This paper presents the submission by the HeiMorph team to the SIGMORPHON 2022 task 2 of Morphological Acquisition Trajectories. Across all experimental conditions, we have found no evidence for the so-called Ushaped development trajectory. Our submitted systems achieve an average test accuracies of 55.5% on Arabic, 67% on German and 73.38% on English. We found that, bigram hallucination provides better inferences only for English and Arabic and only when the number of hallucinations remains low.