Lise Volkart


2024

Post-editors as Gatekeepers of Lexical and Syntactic Diversity: Comparative Analysis of Human Translation and Post-editing in Professional Settings
Lise Volkart | Pierrette Bouillon
Proceedings of the 25th Annual Conference of the European Association for Machine Translation (Volume 1)

This paper presents a comparative analysis between human translation (HT) and post-edited machine translation (PEMT) from a lexical and syntactic perspective to verify whether the tendency of neural machine translation (NMT) systems to produce lexically and syntactically poorer translations shines through after post-editing (PE). The analysis focuses on three datasets collected in professional contexts containing translations from English into French and from German into French. Through a comparison of word translation entropy (HTra) scores, we observe a lower degree of lexical diversity in PEMT compared to HT. Additionally, metrics of syntactic equivalence indicate that PEMT is more likely than HT to mirror the syntactic structure of the source text. By incorporating raw machine translation (MT) output into our analysis, we underline the important role post-editors play in adding lexical and syntactic diversity to MT output. Our findings provide relevant input for MT users and decision-makers in language services, as well as for MT and PE trainers and advisers.
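To make the HTra comparison concrete, here is a minimal sketch of word translation entropy under simplifying assumptions: word alignments between source and target are taken as given (the paper's alignment and preprocessing pipeline is not reproduced here), and the corpus-level score is the mean entropy over source word types. Lower values indicate less varied translation choices, i.e. lower lexical diversity.

```python
import math
from collections import Counter, defaultdict

def htra(aligned_pairs):
    """Word translation entropy (HTra): mean entropy of the
    translation-choice distribution observed for each source word."""
    choices = defaultdict(Counter)
    for src_word, tgt_word in aligned_pairs:
        choices[src_word][tgt_word] += 1
    entropies = []
    for counts in choices.values():
        total = sum(counts.values())
        h = -sum((c / total) * math.log2(c / total) for c in counts.values())
        entropies.append(h)
    return sum(entropies) / len(entropies)

# Toy example: "interesting" has two observed translations (1 bit of entropy),
# "house" always gets the same one (0 bits), so the mean is 0.5.
pairs = [("interesting", "intéressant"), ("interesting", "captivant"),
         ("house", "maison"), ("house", "maison")]
print(htra(pairs))  # 0.5
```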

2022

Studying Post-Editese in a Professional Context: A Pilot Study
Lise Volkart | Pierrette Bouillon
Proceedings of the 23rd Annual Conference of the European Association for Machine Translation

The past few years have seen a multiplication of studies on post-editese, following the massive adoption of post-editing in professional translation workflows. These studies mainly rely on the comparison of post-edited machine translation and human translation on artificial parallel corpora. By contrast, we investigate post-editese here on comparable corpora of authentic translation jobs for the language direction English into French. We explore commonly used scores and also propose a novel metric. Our analysis shows that post-edited machine translation is not only lexically poorer than human translation, but also less dense and less varied in terms of translation solutions. It also tends to be more prolific than human translation for our language direction. Finally, our study highlights some of the challenges of working with comparable corpora in post-editese research.
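For orientation, the kinds of "commonly used scores" in post-editese studies can be sketched as follows; this is an illustrative sketch of standard indicators (lexical variety, lexical density, prolificness), not the paper's novel metric, and it assumes pre-tokenised, lowercased corpora and a POS tag per token from any tagger of your choice.

```python
def type_token_ratio(tokens):
    """Lexical variety: distinct word forms per running word (higher = richer)."""
    return len(set(tokens)) / len(tokens)

def lexical_density(tags, content_tags={"NOUN", "VERB", "ADJ", "ADV"}):
    """Density: share of content words among all tokens, given one POS tag per token."""
    return sum(1 for t in tags if t in content_tags) / len(tags)

def length_ratio(target_tokens, source_tokens):
    """Prolificness: target words produced per source word (higher = more prolific)."""
    return len(target_tokens) / len(source_tokens)
```

Comparing these values between a PEMT corpus and a comparable HT corpus is what allows statements such as "lexically poorer" or "more prolific" to be quantified.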

2020

Re-design of the Machine Translation Training Tool (MT3)
Paula Estrella | Emiliano Cuenca | Laura Bruno | Jonathan Mutal | Sabrina Girletti | Lise Volkart | Pierrette Bouillon
Proceedings of the 22nd Annual Conference of the European Association for Machine Translation

We believe that machine translation (MT) must be introduced to translation students as part of their training, in preparation for their professional life. In this paper we present a new version of the tool called MT3, which builds on and extends a joint effort undertaken by the Faculty of Languages of the University of Córdoba and Faculty of Translation and Interpreting of the University of Geneva to develop an open-source web platform to teach MT to translation students. We also report on a pilot experiment with the goal of testing the viability of using MT3 in an MT course. The pilot let us identify areas for improvement and collect students’ feedback about the tool’s usability.

2019

Differences between SMT and NMT Output - a Translators’ Point of View
Jonathan Mutal | Lise Volkart | Pierrette Bouillon | Sabrina Girletti | Paula Estrella
Proceedings of the Human-Informed Translation and Interpreting Technology Workshop (HiT-IT 2019)

In this study, we compare the output quality of two MT systems, a statistical (SMT) and a neural (NMT) engine, customised for Swiss Post’s Language Service using the same training data. We focus on the point of view of professional translators and investigate how they perceive the differences between the MT output and a human reference (namely deletions, substitutions, insertions and word order). Our findings show that translators more frequently consider these differences to be errors in SMT than in NMT, and that deletions are the most serious errors in both architectures. We also observe lower agreement on which differences should be corrected in NMT than in SMT, suggesting that errors are easier to identify in SMT output. These findings confirm the ability of NMT to produce correct paraphrases, which could also explain why BLEU is often considered an inadequate metric for evaluating the performance of NMT systems.
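On the BLEU point: because BLEU rewards n-gram overlap with a reference, an adequate paraphrase can score poorly. A minimal illustration using the sacrebleu library (the toy sentences are ours, not from the study):

```python
import sacrebleu  # pip install sacrebleu

reference = ["The meeting was postponed until next week."]
# An adequate paraphrase of the reference with little n-gram overlap:
paraphrase = ["The meeting has been put off to the following week."]

# corpus_bleu takes a list of hypotheses and a list of reference lists.
print(sacrebleu.corpus_bleu(paraphrase, [reference]).score)  # low despite correct meaning
```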