Lieve Macken

2025

Proceedings of the Second Workshop on Creative-text Translation and Technology (CTT)
Bram Vanroy | Marie-Aude Lefer | Lieve Macken | Paola Ruffo | Ana Guerberof Arenas | Damien Hansen

The Role of Translation Workflows in Overcoming Translation Difficulties: A Comparative Analysis of Human and Machine Translation (Post-Editing) Approaches
Lieve Macken | Paola Ruffo | Joke Daems
Proceedings of the Second Workshop on Creative-text Translation and Technology (CTT)

This study investigates the impact of different translation workflows and underlying machine translation technologies on the translation strategies used in literary translations. We compare human translation, translation within a computer-assisted translation (CAT) tool, and machine translation post-editing (MTPE), alongside neural machine translation (NMT) and large language models (LLMs). Using three short stories translated from English into Dutch, we annotated translation difficulties and strategies employed to overcome them. Our analysis reveals differences in translation solutions across modalities, highlighting the influence of technology on the final translation. The findings suggest that while MTPE tends to produce more literal translations, human translators and CAT tools exhibit greater creativity and employ more non-literal translation strategies. Additionally, LLMs reduced the number of literal translation solutions compared to traditional NMT systems. While our study provides valuable insights, it is limited by the use of only three texts and a single language pair. Further research is needed to explore these dynamics across a broader range of texts and languages, to better understand the full impact of translation workflows and technologies on literary translation.

Can Peter Pan Survive MT? A Stylometric Study of LLMs, NMTs, and HTs in Children’s Literature Translation
Delu Kong | Lieve Macken
Proceedings of the Second Workshop on Creative-text Translation and Technology (CTT)

This study focuses on evaluating the performance of machine translations (MTs) compared to human translations (HTs) in children’s literature translation (CLT) from a stylometric perspective. The research constructs a Peter Pan corpus, comprising 21 translations: 7 human translations (HTs), 7 large language model translations (LLMs), and 7 neural machine translation outputs (NMTs). The analysis employs a generic feature set (including lexical, syntactic, readability, and n-gram features) and a creative text translation (CTT-specific) feature set, which captures repetition, rhyme, translatability, and miscellaneous levels, yielding 447 linguistic features in total. Using classification and clustering techniques in machine learning, we conduct a stylometric analysis of these translations. Results reveal that in generic features, HTs and MTs exhibit significant differences in conjunction word distributions and the ratio of 1-word-gram-一样, while NMTs and LLMs show significant variation in descriptive words usage and adverb ratios. Regarding CTT-specific features, LLMs outperform NMTs in distribution, aligning more closely with HTs in stylistic characteristics, demonstrating the potential of LLMs in CLT.

Decoding Machine Translationese in English-Chinese News: LLMs vs. NMTs
Delu Kong | Lieve Macken
Proceedings of Machine Translation Summit XX: Volume 1

This study explores Machine Translationese (MTese) — the linguistic peculiarities of machine translation outputs — focusing on the under-researched English-to-Chinese language pair in news texts. We construct a large dataset consisting of 4 sub-corpora and employ a comprehensive five-layer feature set. Then, a chi-square ranking algorithm is applied for feature selection in both classification and clustering tasks. Our findings confirm the presence of MTese in both Neural Machine Translation systems (NMTs) and Large Language Models (LLMs). Original Chinese texts are nearly perfectly distinguishable from both LLM and NMT outputs. Notable linguistic patterns in MT outputs are shorter sentence lengths and increased use of adversative conjunctions. Comparing LLMs and NMTs, we achieve approximately 70% classification accuracy, with LLMs exhibiting greater lexical diversity and NMTs using more brackets. Additionally, translation-specific LLMs show lower lexical diversity but higher usage of causal conjunctions compared to generic LLMs. Lastly, we find no significant differences between LLMs developed by Chinese firms and their foreign counterparts.

We present key interim findings from the ongoing MaTIAS project, which focuses on developing a multilingual notification system for asylum reception centres in Belgium. This system integrates machine translation (MT) to enable staff to provide practical information to residents in their native language, thus fostering more effective communication. Our discussion focuses on three key aspects: the development of the multilingual messaging platform, the types of messages the system is designed to handle, and the evaluation of potential MT systems for integration.

2024

Proceedings of the 1st Workshop on Creative-text Translation and Technology
Bram Vanroy | Marie-Aude Lefer | Lieve Macken | Paola Ruffo

Impact of translation workflows with and without MT on textual characteristics in literary translation
Joke Daems | Paola Ruffo | Lieve Macken
Proceedings of the 1st Workshop on Creative-text Translation and Technology

The use of machine translation is increasingly being explored for the translation of literary texts, but there is still a lot of uncertainty about the optimal translation workflow in these scenarios. While overall quality is quite good, certain textual characteristics can be different in a human translated text and a text produced by means of machine translation post-editing, which has been shown to potentially have an impact on reader perceptions and experience as well. In this study, we look at textual characteristics from short story translations from B.J. Novak’s One more thing into Dutch. Twenty-three professional literary translators translated three short stories, in three different conditions: using Word, using the classic CAT tool Trados, and using a machine translation post-editing platform specifically designed for literary translation. We look at overall text characteristics (sentence length, type-token ratio, stylistic differences) to establish whether translation workflow has an impact on these features, and whether the three workflows lead to very different final translations or not.

Machine Translation Meets Large Language Models: Evaluating ChatGPT’s Ability to Automatically Post-Edit Literary Texts
Lieve Macken
Proceedings of the 1st Workshop on Creative-text Translation and Technology

Large language models such as GPT-4 have been trained on vast corpora, giving them excellent language understanding. This study explores the use of ChatGPT for post-editing machine translations of literary texts. Three short stories, machine translated from English into Dutch, were post-edited by 7-8 professional translators and ChatGPT. Automatic metrics were used to evaluate the number and type of edits made, and semantic and syntactic similarity between the machine translation and the corresponding post-edited versions. A manual analysis classified errors in the machine translation and changes made by the post-editors. The results show that ChatGPT made more changes than the average post-editor. ChatGPT improved lexical richness over machine translation for all texts. The analysis of editing types showed that ChatGPT replaced more words with synonyms, corrected fewer machine errors and introduced more problems than professionals.

This project aims to develop a multilingual notification system for asylum reception centres in Belgium using machine translation. The system will allow staff to communicate practical messages to residents in their own language. Ethnographically inspired fieldwork is being conducted in reception centres to understand current communication practices and ensure that the technology meets user needs. The quality and suitability of machine translation will be evaluated for three MT systems supporting all target languages. Automatic and manual evaluation methods will be used to assess translation quality, and terms of use, privacy and data protection conditions will be analysed.

2023

Adapting Machine Translation Education to the Neural Era: A Case Study of MT Quality Assessment
Lieve Macken | Bram Vanroy | Arda Tezcan
Proceedings of the 24th Annual Conference of the European Association for Machine Translation

The use of automatic evaluation metrics to assess Machine Translation (MT) quality is well established in the translation industry. Whereas it is relatively easy to cover the word- and character-based metrics in an MT course, it is less obvious to integrate the newer neural metrics. In this paper we discuss how we introduced the topic of MT quality assessment in a course for translation students. We selected three English source texts, each having a different difficulty level and style, and let the students translate the texts into their L1 and reflect upon translation difficulty. Afterwards, the students were asked to assess MT quality for the same texts using different methods and to critically reflect upon obtained results. The students had access to the MATEO web interface, which contains word- and character-based metrics as well as neural metrics. The students used two different reference translations: their own translations and professional translations of the three texts. We not only synthesise the comments of the students, but also present the results of some cross-lingual analyses on nine different language pairs.

Developing User-centred Approaches to Technological Innovation in Literary Translation (DUAL-T)
Paola Ruffo | Joke Daems | Lieve Macken
Proceedings of the 24th Annual Conference of the European Association for Machine Translation

DUAL-T is an EU-funded project which aims at involving literary translators in the testing of technology-inclusive workflows. Participants will be asked to translate three short stories using, respectively, (1) a text editor combined with online resources, (2) a Computer-Aided Translation (CAT) tool, and (3) a Machine Translation Post-editing (MTPE) tool.

MATEO: MAchine Translation Evaluation Online
Bram Vanroy | Arda Tezcan | Lieve Macken
Proceedings of the 24th Annual Conference of the European Association for Machine Translation

We present MAchine Translation Evaluation Online (MATEO), a project that aims to facilitate machine translation (MT) evaluation by means of an easy-to-use interface that can evaluate given machine translations with a battery of automatic metrics. It caters to both experienced and novice users who are working with MT, such as MT system builders, teachers and students of (machine) translation, and researchers.

2022

Literary translation as a three-stage process: machine translation, post-editing and revision
Lieve Macken | Bram Vanroy | Luca Desmet | Arda Tezcan
Proceedings of the 23rd Annual Conference of the European Association for Machine Translation

This study focuses on English-Dutch literary translations that were created in a professional environment using an MT-enhanced workflow consisting of a three-stage process of automatic translation followed by post-editing and (mainly) monolingual revision. We compare the three successive versions of the target texts. We used different automatic metrics to measure the (dis)similarity between the consecutive versions and analyzed the linguistic characteristics of the three translation variants. Additionally, on a subset of 200 segments, we manually annotated all errors in the machine translation output and classified the different editing actions that were carried out. The results show that more editing occurred during revision than during post-editing and that the types of editing actions were different.

Writing in a second Language with Machine translation (WiLMa)
Margot Fonteyne | Maribel Montero Perez | Joke Daems | Lieve Macken
Proceedings of the 23rd Annual Conference of the European Association for Machine Translation

The WiLMa project aims to assess the effects of using machine translation (MT) tools on the writing processes of second language (L2) learners of varying proficiency. Particular attention is given to individual variation in learners’ tool use.

GECO-MT: The Ghent Eye-tracking Corpus of Machine Translation
Toon Colman | Margot Fonteyne | Joke Daems | Nicolas Dirix | Lieve Macken
Proceedings of the Thirteenth Language Resources and Evaluation Conference

In the present paper, we describe a large corpus of eye movement data, collected during natural reading of a human translation and a machine translation of a full novel. This data set, called GECO-MT (Ghent Eye-tracking Corpus of Machine Translation), expands upon an earlier corpus called GECO (Ghent Eye-tracking Corpus) by Cop et al. (2017). The eye movement data in GECO-MT will be used in future research to investigate the effect of machine translation on the reading process and the effects of various error types on reading. In this article, we describe in detail the materials and data collection procedure of GECO-MT. Extensive information on the language proficiency of our participants is given, as well as a comparison with the participants of the original GECO. We investigate the distribution of a selection of important eye movement variables and explore the possibilities for future analyses of the data. GECO-MT is freely available at https://www.lt3.ugent.be/resources/geco-mt.

LeConTra: A Learner Corpus of English-to-Dutch News Translation
Bram Vanroy | Lieve Macken
Proceedings of the Thirteenth Language Resources and Evaluation Conference

We present LeConTra, a learner corpus consisting of English-to-Dutch news translations enriched with translation process data. Three students of a Master’s programme in Translation were asked to translate 50 different English journalistic texts of approximately 250 tokens each. Because we also collected translation process data in the form of keystroke logging, our dataset can be used as part of different research strands such as translation process research, learner corpus research, and corpus-based translation studies. Reference translations, without process data, are also included. The data has been manually segmented and tokenized, and manually aligned at both segment and word level, leading to a high-quality corpus with token-level process data. The data is freely accessible via the Translation Process Research DataBase, which emphasises our commitment to distributing our dataset. The tool that was built for manual sentence segmentation and tokenization, Mantis, is also available as an open-source aid for data processing.

2020

Assessing the Comprehensibility of Automatic Translations (ArisToCAT)
Lieve Macken | Margot Fonteyne | Arda Tezcan | Joke Daems
Proceedings of the 22nd Annual Conference of the European Association for Machine Translation

The ArisToCAT project aims to assess the comprehensibility of ‘raw’ (unedited) MT output for readers who can only rely on the MT output. In this project description, we summarize the main results of the project and present future work.

Literary Machine Translation under the Magnifying Glass: Assessing the Quality of an NMT-Translated Detective Novel on Document Level
Margot Fonteyne | Arda Tezcan | Lieve Macken
Proceedings of the Twelfth Language Resources and Evaluation Conference

Several studies (covering many language pairs and translation tasks) have demonstrated that translation quality has improved enormously since the emergence of neural machine translation systems. This raises the question whether such systems are able to produce high-quality translations for more creative text types such as literature and whether they are able to generate coherent translations on document level. Our study aimed to investigate these two questions by carrying out a document-level evaluation of the raw NMT output of an entire novel. We translated Agatha Christie’s novel The Mysterious Affair at Styles with Google’s NMT system from English into Dutch and annotated it in two steps: first all fluency errors, then all accuracy errors. We report on the overall quality, determine the remaining issues, compare the most frequent error types to those in general-domain MT, and investigate whether any accuracy and fluency errors co-occur regularly. Additionally, we assess the inter-annotator agreement on the first chapter of the novel.

2019

Modelling word translation entropy and syntactic equivalence with machine learning
Bram Vanroy | Orphée De Clercq | Lieve Macken
Proceedings of the Second MEMENTO workshop on Modelling Parameters of Cognitive Effort in Translation Production

When a ‘sport’ is a person and other issues for NMT of novels
Arda Tezcan | Joke Daems | Lieve Macken
Proceedings of the Qualities of Literary Machine Translation

2018

We present the highlights of the now-finished four-year SCATE project. It was completed in February 2018 and funded by the Flemish Government IWT-SBO, project No. 130041.

A fine-grained error analysis of NMT, SMT and RBMT output for English-to-Dutch
Laura Van Brussel | Arda Tezcan | Lieve Macken
Proceedings of the Eleventh International Conference on Language Resources and Evaluation (LREC 2018)

In order to improve the symbiosis between machine translation (MT) system and post-editor, it is not enough to know that the output of one system is better than the output of another system. A fine-grained error analysis is needed to provide information on the type and location of errors occurring in MT and the corresponding errors occurring after post-editing (PE). This article reports on a fine-grained translation quality assessment approach which was applied to machine-translated texts and the post-edited versions of these texts, made by student post-editors. By linking each error to the corresponding source-text passage, it is possible to identify passages that were problematic in MT, but not after PE, or passages that were problematic even after PE. This method provides rich data on the origin and impact of errors, which can be used to improve post-editor training as well as machine translation systems. We present the results of a pilot experiment on the post-editing of newspaper articles and highlight the advantages of our approach.

2013

Quality as the sum of its parts: a two-step approach for the identification of translation problems and translation quality assessment for HT and MT+PE
Joke Daems | Lieve Macken | Sonia Vandepitte
Proceedings of the 2nd Workshop on Post-editing Technology and Practice

2012

From keystrokes to annotated process data: Enriching the output of Inputlog with linguistic information
Lieve Macken | Veronique Hoste | Mariëlle Leijten | Luuk Van Waes
Proceedings of the Eighth International Conference on Language Resources and Evaluation (LREC'12)

Keystroke logging tools are a valuable aid to monitor written language production. These tools record all keystrokes, including backspaces and deletions together with timing information. In this paper we report on an extension to the keystroke logging program Inputlog in which we aggregate the logged process data from the keystroke (character) level to the word level. The logged process data are further enriched with different kinds of linguistic information: part-of-speech tags, lemmata, chunk boundaries, syllable boundaries and word frequency. A dedicated parser has been developed that distils from the logged process data word-level revisions, deleted fragments and final product data. The linguistically-annotated output will facilitate the linguistic analysis of the logged data and will provide a valuable basis for more linguistically-oriented writing process research. The set-up of the extension to Inputlog is largely language-independent. As proof-of-concept, the extension has been developed for English and Dutch. Inputlog is freely available for research purposes.

From Character to Word Level: Enabling the Linguistic Analyses of Inputlog Process Data
Mariëlle Leijten | Lieve Macken | Veronique Hoste | Eric Van Horenbeeck | Luuk Van Waes
Proceedings of the Second Workshop on Computational Linguistics and Writing (CL&W 2012): Linguistic and Cognitive Aspects of Document Creation and Document Engineering

2010

An Annotation Scheme and Gold Standard for Dutch-English Word Alignment
Lieve Macken
Proceedings of the Seventh International Conference on Language Resources and Evaluation (LREC'10)

The importance of sentence-aligned parallel corpora has been widely acknowledged. Reference corpora in which sub-sentential translational correspondences are indicated manually are more labour-intensive to create, and hence less wide-spread. Such manually created reference alignments -- also called Gold Standards -- have been used in research projects to develop or test automatic word alignment systems. In most translations, translational correspondences are rather complex; for example word-by-word correspondences can be found only for a limited number of words. A reference corpus in which those complex translational correspondences are aligned manually is therefore also a useful resource for the development of translation tools and for translation studies. In this paper, we describe how we created a Gold Standard for the Dutch-English language pair. We present the annotation scheme, annotation guidelines, annotation tool and inter-annotator results. To cover a wide range of syntactic and stylistic phenomena that emerge from different writing and translation styles, our Gold Standard data set contains texts from different text types. The Gold Standard will be publicly available as part of the Dutch Parallel Corpus.

2009

Language-Independent Bilingual Terminology Extraction from a Multilingual Parallel Corpus
Els Lefever | Lieve Macken | Veronique Hoste
Proceedings of the 12th Conference of the European Chapter of the ACL (EACL 2009)

2008

Linguistically-Based Sub-Sentential Alignment for Terminology Extraction from a Bilingual Automotive Corpus
Lieve Macken | Els Lefever | Veronique Hoste
Proceedings of the 22nd International Conference on Computational Linguistics (Coling 2008)

Sentence Alignment in DPC: Maximizing Precision, Minimizing Human Effort
Julia Trushkina | Lieve Macken | Hans Paulussen
Proceedings of the Sixth International Conference on Language Resources and Evaluation (LREC'08)

A wide spectrum of multilingual applications have aligned parallel corpora as their prerequisite. The aim of the project described in this paper is to build a multilingual corpus where all sentences are aligned at very high precision with minimal human effort involved. The experiments on a combination of sentence aligners with different underlying algorithms described in this paper showed that by verifying only those links which were not recognized by at least two aligners, the error rate can be reduced by 93.76% as compared to the performance of the best aligner. Such manual involvement concerned only a small portion of all data (6%). This significantly reduces the load of manual work necessary to achieve nearly 100% accuracy of alignment.