Maarit Koponen

2025

Generative AI for Technical Writing: Comparing Human and LLM Assessments of Generated Content
Karen de Souza | Alexandre Nikolaev | Maarit Koponen
Proceedings of the Joint 25th Nordic Conference on Computational Linguistics and 11th Baltic Conference on Human Language Technologies (NoDaLiDa/Baltic-HLT 2025)

Large language models (LLMs) have recently gained significant attention for their capabilities in natural language processing (NLP), particularly generative artificial intelligence (AI). LLMs can also be useful tools for software documentation technical writers. We present an assessment of technical documentation content generated by three different LLMs using retrieval-augmented generation (RAG) with product documentation as a knowledge base. The LLM-generated responses were analyzed in three ways: 1) manual error analysis by a technical writer, 2) automatic assessment using deterministic metrics (BLEU, ROUGE, token overlap), and 3) evaluation of correctness by LLM as a judge. The results of these assessments were compared using a Network Analysis and linear regression models to investigate statistical relationships, model preferences, and the distribution of human and LLM scores. The analyses concluded that human quality evaluation is more related to the LLM correctness judgment than deterministic metrics, even when using different analysis frameworks.

Machine translation as support for epistemic capacities: Findings from the DECA project
Maarit Koponen | Nina Havumetsä | Juha Lång | Mary Nurminen
Proceedings of Machine Translation Summit XX: Volume 2

The DECA project consortium investigates epistemic capacities, defined as an individual’s access to reliable knowledge, their ability to participate in knowledge production, and society’s capacity to make informed, sustainable policy decisions. As a tool both for accessing information across language barriers and for producing multilingual information, machine translation also plays a potential role in supporting these epistemic capacities. In this paper, we present an overview of DECA’s research on two perspectives: 1) how migrants use machine translation to access information, and 2) how journalists use machine translation in their work.

Proceedings of the 2nd LUHME Workshop
Henrique Lopes Cardoso | Rui Sousa-Silva | Maarit Koponen | Antonio Pareja-Lora
Proceedings of the 2nd LUHME Workshop

2024

Effects of different types of noise in user-generated reviews on human and machine translations including ChatGPT
Maja Popovic | Ekaterina Lapshinova-Koltunski | Maarit Koponen
Proceedings of the Ninth Workshop on Noisy and User-generated Text (W-NUT 2024)

This paper investigates the effects of noisy source texts (containing spelling and grammar errors, informal words or expressions, etc.) on human and machine translations, namely whether the noisy phenomena are kept in the translations, corrected, or cause errors. The analysed data consists of English user reviews of Amazon products translated into Croatian, Russian and Finnish by professional translators, translation students, machine translation (MT) systems, and the ChatGPT language model. The results show that overall, ChatGPT and professional translators mostly correct/standardise those parts, while students often keep them. Furthermore, MT systems are most prone to errors while ChatGPT is more robust, but notably less robust than human translators. Finally, some of the phenomena are particularly challenging both for MT systems and for ChatGPT, especially spelling errors and informal constructions.

Proceedings of the 1st LUHME Workshop
Rui Sousa-Silva | Henrique Lopes Cardoso | Maarit Koponen | Antonio Pareja Lora | Márta Seresi
Proceedings of the 1st LUHME Workshop

2023

DECA: Democratic epistemic capacities in the age of algorithms
Maarit Koponen | Mary Nurminen | Nina Havumetsä | Juha Lång
Proceedings of the 24th Annual Conference of the European Association for Machine Translation

The DECA project consortium investigates epistemic capacities, defined as an individual’s access to reliable knowledge, their ability to participate in knowledge production, and society’s capacity to make informed, sustainable policy decisions. In this paper, we focus specifically on the parts of the project examining the challenges posed by multilinguality in these processes and the potential role of MT in supporting access to, and production of, knowledge.

Computational analysis of different translations: by professionals, students and machines
Maja Popovic | Ekaterina Lapshinova-Koltunski | Maarit Koponen
Proceedings of the 24th Annual Conference of the European Association for Machine Translation

In this work, we analyse different translated texts in terms of various text features. We compare two types of human translations, professional and students’, and machine translation outputs in terms of lexical and grammatical variety, sentence length, as well as frequencies of different POS tags and POS-trigrams. Our experiments are carried out on parallel translations into three languages, Croatian, Finnish and Russian, all originating from the same source English texts. Our results indicate that machine translations are closest to the source text, followed by student translations. Also, student translations are similar both to professional as well as to MT, sometimes even more to MT. Furthermore, we identify sets of features which are convenient for distinguishing machine from human translations.

Do Humans Translate like Machines? Students’ Conceptualisations of Human and Machine Translation
Salmi Leena | Aletta G. Dorst | Maarit Koponen | Katinka Zeven
Proceedings of the 24th Annual Conference of the European Association for Machine Translation

This paper explores how students conceptualise the processes involved in human and machine translation, and how they describe the similarities and differences between them. The paper presents the results of a survey involving university students (B.A. and M.A.) taking a course on translation who filled out an online questionnaire distributed in Finnish, Dutch and English. Our study finds that students often describe both human translation and machine translation in similar terms, suggesting they do not sufficiently distinguish between them and do not fully understand how machine translation works. The current study suggests that training in Machine Translation Literacy may need to focus more on the conceptualisations involved and how conceptual and vernacular misconceptions may affect how translators understand human and machine translation.

2022

DiHuTra: a Parallel Corpus to Analyse Differences between Human Translations
Ekaterina Lapshinova-Koltunski | Maja Popović | Maarit Koponen
Proceedings of the Thirteenth Language Resources and Evaluation Conference

This paper describes a new corpus of human translations which contains both professional and student translations. The data consists of English sources – texts from news and reviews – and their translations into Russian and Croatian, as well as a subcorpus containing translations of the review texts into Finnish. All target languages are mid-resourced and less- or mid-investigated languages. The corpus will be valuable for studying variation in translation as it allows a direct comparison between human translations of the same source texts. The corpus will also be a valuable resource for evaluating machine translation systems. We believe that this resource will facilitate understanding and improvement of the quality issues in both human and machine translation. In the paper, we describe how the data was collected, provide information on translator groups and summarise the differences between the human translations at hand based on our preliminary results with shallow features.

LITHME: Language in the Human-Machine Era
Maarit Koponen | Kais Allkivi-Metsoja | Antonio Pareja-Lora | Dave Sayers | Márta Seresi
Proceedings of the 23rd Annual Conference of the European Association for Machine Translation

The LITHME COST Action brings together researchers from various fields of study focusing on language and technology. We present the overall goals of LITHME and the network’s working groups focusing on diverse questions related to language and technology. As an example of the work of the LITHME network, we discuss the working group on language work and language professionals.

DiHuTra: a Parallel Corpus to Analyse Differences between Human Translations
Ekaterina Lapshinova-Koltunski | Maja Popović | Maarit Koponen
Proceedings of the 23rd Annual Conference of the European Association for Machine Translation

This project aimed to design a corpus of parallel human translations (HTs) of the same source texts by professionals and students. The resulting corpus consists of English news and reviews source texts, their translations into Russian and Croatian, and translations of the reviews into Finnish. The corpus will be valuable for both studying variation in translation and evaluating machine translation (MT) systems.

2020

MT for subtitling: User evaluation of post-editing productivity
Maarit Koponen | Umut Sulubacak | Kaisa Vitikainen | Jörg Tiedemann
Proceedings of the 22nd Annual Conference of the European Association for Machine Translation

This paper presents a user evaluation of machine translation and post-editing for TV subtitles. Based on a process study where 12 professional subtitlers translated and post-edited subtitles, we compare effort in terms of task time and number of keystrokes. We also discuss examples of specific subtitling features like condensation, and how these features may have affected the post-editing results. In addition to overall MT quality, segmentation and timing of the subtitles are found to be important issues to be addressed in future work.

MT for Subtitling: Investigating professional translators’ user experience and feedback
Maarit Koponen | Umut Sulubacak | Kaisa Vitikainen | Jörg Tiedemann
Proceedings of 1st Workshop on Post-Editing in Modern-Day Translation

2018

Progress in the quality of machine translation output calls for new automatic evaluation procedures and metrics. In this paper, we extend the Morpheval protocol introduced by Burlot and Yvon (2017) for the English-to-Czech and English-to-Latvian translation directions to three additional language pairs, and report its use to analyze the results of WMT 2018’s participants for these language pairs. Considering additional, typologically varied source and target languages also enables us to draw some generalizations regarding this morphology-oriented evaluation procedure.

2015

How to teach machine translation post-editing? Experiences from a post-editing course
Maarit Koponen
Proceedings of the 4th Workshop on Post-editing Technology and Practice

2013

This translation is not too bad: an analysis of post-editor choices in a machine-translation post-editing task
Maarit Koponen
Proceedings of the 2nd Workshop on Post-editing Technology and Practice

2012

Comparing human perceptions of post-editing effort with post-editing operations
Maarit Koponen
Proceedings of the Seventh Workshop on Statistical Machine Translation

Post-editing time as a measure of cognitive effort
Maarit Koponen | Wilker Aziz | Luciana Ramos | Lucia Specia
Workshop on Post-Editing Technology and Practice

Post-editing machine translations has been attracting increasing attention both as a common practice within the translation industry and as a way to evaluate Machine Translation (MT) quality via edit distance metrics between the MT and its post-edited version. Commonly used metrics such as HTER are limited in that they cannot fully capture the effort required for post-editing. Particularly, the cognitive effort required may vary for different types of errors and may also depend on the context. We suggest post-editing time as a way to assess some of the cognitive effort involved in post-editing. This paper presents two experiments investigating the connection between post-editing time and cognitive effort. First, we examine whether sentences with long and short post-editing times involve edits of different levels of difficulty. Second, we study the variability in post-editing time and other statistics among editors.
