Bram Vanroy


2023

Adapting Machine Translation Education to the Neural Era: A Case Study of MT Quality Assessment
Lieve Macken | Bram Vanroy | Arda Tezcan
Proceedings of the 24th Annual Conference of the European Association for Machine Translation

The use of automatic evaluation metrics to assess Machine Translation (MT) quality is well established in the translation industry. Whereas it is relatively easy to cover the word- and character-based metrics in an MT course, it is less obvious how to integrate the newer neural metrics. In this paper we discuss how we introduced the topic of MT quality assessment in a course for translation students. We selected three English source texts, each having a different difficulty level and style, and let the students translate the texts into their L1 and reflect upon translation difficulty. Afterwards, the students were asked to assess MT quality for the same texts using different methods and to critically reflect upon the obtained results. The students had access to the MATEO web interface, which contains word- and character-based metrics as well as neural metrics. The students used two different reference translations: their own translations and professional translations of the three texts. We not only synthesise the comments of the students, but also present the results of some cross-lingual analyses on nine different language pairs.

MATEO: MAchine Translation Evaluation Online
Bram Vanroy | Arda Tezcan | Lieve Macken
Proceedings of the 24th Annual Conference of the European Association for Machine Translation

We present MAchine Translation Evaluation Online (MATEO), a project that aims to facilitate machine translation (MT) evaluation by means of an easy-to-use interface that can evaluate given machine translations with a battery of automatic metrics. It caters to both experienced and novice users who are working with MT, such as MT system builders, teachers and students of (machine) translation, and researchers.

SignON: Sign Language Translation. Progress and challenges.
Vincent Vandeghinste | Dimitar Shterionov | Mirella De Sisto | Aoife Brady | Mathieu De Coster | Lorraine Leeson | Josep Blat | Frankie Picron | Marcello Paolo Scipioni | Aditya Parikh | Louis ten Bosch | John O’Flaherty | Joni Dambre | Jorn Rijckaert | Bram Vanroy | Victor Ubieto Nogales | Santiago Egea Gomez | Ineke Schuurman | Gorka Labaka | Adrián Núnez-Marcos | Irene Murtagh | Euan McGill | Horacio Saggion
Proceedings of the 24th Annual Conference of the European Association for Machine Translation

SignON (https://signon-project.eu/) is a Horizon 2020 project, running from 2021 until the end of 2023, which addresses the lack of technology and services for the automatic translation between sign languages (SLs) and spoken languages, through an inclusive, human-centric solution, hence contributing to the repertoire of communication media for deaf, hard of hearing (DHH) and hearing individuals. In this paper, we present an update of the status of the project, describing the approaches developed to address the challenges and peculiarities of SL machine translation (SLMT).

Are there just WordNets or also SignNets?
Ineke Schuurman | Thierry Declerck | Caro Brosens | Margot Janssens | Vincent Vandeghinste | Bram Vanroy
Proceedings of the 12th Global Wordnet Conference

For Sign Languages (SLs), can we create a SignNet, like a WordNet for spoken languages: a network of semantic relations between constitutive elements of SLs? We first discuss approaches that link SL data to wordnets, or integrate such elements with some adaptations into the structure of WordNet. Then, we present requirements for a SignNet, which is built on SL data and then linked to WordNet.

Proceedings of the Second International Workshop on Automatic Translation for Signed and Spoken Languages
Dimitar Shterionov | Mirella De Sisto | Mathias Muller | Davy Van Landuyt | Rehana Omardeen | Shaun Oboyle | Annelies Braffort | Floris Roelofsen | Fred Blain | Bram Vanroy | Eleftherios Avramidis
Proceedings of the Second International Workshop on Automatic Translation for Signed and Spoken Languages

2022

Proceedings of the 23rd Annual Conference of the European Association for Machine Translation
Helena Moniz | Lieve Macken | Andrew Rufener | Loïc Barrault | Marta R. Costa-jussà | Christophe Declercq | Maarit Koponen | Ellie Kemp | Spyridon Pilos | Mikel L. Forcada | Carolina Scarton | Joachim Van den Bogaert | Joke Daems | Arda Tezcan | Bram Vanroy | Margot Fonteyne
Proceedings of the 23rd Annual Conference of the European Association for Machine Translation

Literary translation as a three-stage process: machine translation, post-editing and revision
Lieve Macken | Bram Vanroy | Luca Desmet | Arda Tezcan
Proceedings of the 23rd Annual Conference of the European Association for Machine Translation

This study focuses on English-Dutch literary translations that were created in a professional environment using an MT-enhanced workflow consisting of a three-stage process of automatic translation followed by post-editing and (mainly) monolingual revision. We compare the three successive versions of the target texts. We used different automatic metrics to measure the (dis)similarity between the consecutive versions and analyzed the linguistic characteristics of the three translation variants. Additionally, on a subset of 200 segments, we manually annotated all errors in the machine translation output and classified the different editing actions that were carried out. The results show that more editing occurred during revision than during post-editing and that the types of editing actions were different.

LeConTra: A Learner Corpus of English-to-Dutch News Translation
Bram Vanroy | Lieve Macken
Proceedings of the Thirteenth Language Resources and Evaluation Conference

We present LeConTra, a learner corpus consisting of English-to-Dutch news translations enriched with translation process data. Three students of a Master’s programme in Translation were asked to translate 50 different English journalistic texts of approximately 250 tokens each. Because we also collected translation process data in the form of keystroke logging, our dataset can be used as part of different research strands such as translation process research, learner corpus research, and corpus-based translation studies. Reference translations, without process data, are also included. The data has been manually segmented and tokenized, and manually aligned at both segment and word level, leading to a high-quality corpus with token-level process data. The data is freely accessible via the Translation Process Research DataBase, which emphasises our commitment to distributing our dataset. The tool that was built for manual sentence segmentation and tokenization, Mantis, is also available as an open-source aid for data processing.

2020

LT3 at SemEval-2020 Task 7: Comparing Feature-Based and Transformer-Based Approaches to Detect Funny Headlines
Bram Vanroy | Sofie Labat | Olha Kaminska | Els Lefever | Veronique Hoste
Proceedings of the Fourteenth Workshop on Semantic Evaluation

This paper presents two different systems for the SemEval shared task 7 on Assessing Humor in Edited News Headlines, sub-task 1, where the aim was to estimate the intensity of humor generated in edited headlines. Our first system is a feature-based machine learning system that combines different types of information (e.g. word embeddings, string similarity, part-of-speech tags, perplexity scores, named entity recognition) in a Nu Support Vector Regressor (NuSVR). The second system is a deep learning-based approach that uses the pre-trained language model RoBERTa to learn latent features in the news headlines that are useful to predict the funniness of each headline. The latter system was also our final submission to the competition and ranked seventh among the 49 participating teams, with a root-mean-square error (RMSE) of 0.5253.

2019

Modelling word translation entropy and syntactic equivalence with machine learning
Bram Vanroy | Orphée De Clercq | Lieve Macken
Proceedings of the Second MEMENTO workshop on Modelling Parameters of Cognitive Effort in Translation Production