Michael Carl


2024

Impact of Syntactic Complexity on the Processes and Performance of Large Language Models-leveraged Post-editing
Longhui Zou | Michael Carl | Shaghayegh Momtaz | Mehdi Mirzapour
Proceedings of the 16th Conference of the Association for Machine Translation in the Americas (Volume 2: Presentations)

This research explores the interaction between human translators and Large Language Models (LLMs) during post-editing (PE). The study examines the impact of syntactic complexity on the PE processes and performance, specifically when working with the raw translation output generated by GPT-4. We selected four English source texts (STs) from previous American Translators Association (ATA) certification examinations. Each text consists of about 10 segments and 250 words. GPT-4 was employed to translate the four STs from English into Simplified Chinese. The empirical experiment simulated the authentic work environment of PE, using a professional computer-assisted translation (CAT) tool, Trados. The experiment involved 46 participants with different levels of translation expertise (30 student translators and 16 expert translators), producing altogether 2162 segments of PE versions. We implemented five syntactic complexity metrics in the context of PE for quantitative analysis.

Using Machine Learning to Validate a Novel Taxonomy of Phenomenal Translation States
Michael Carl | Sheng Lu | Ali Al-Ramadan
Proceedings of the 25th Annual Conference of the European Association for Machine Translation (Volume 1)

We report an experiment in which we use machine learning to validate the empirical objectivity of a novel annotation taxonomy for behavioral translation data. The HOF taxonomy defines three translation states according to which a human translator can be in a state of Orientation (O), Hesitation (H) or in a Flow state (F). We aim at validating the taxonomy based on a manually annotated dataset that consists of six English-Spanish translation sessions (approx. 900 words) and 1813 HOF-annotated Activity Units (AUs). Two annotators annotated the data and obtained a high average inter-annotator accuracy of 0.76 (kappa 0.88). We trained two classifiers, a Multi-layer Perceptron (MLP) and a Random Forest (RF), on the annotated data and tested them on held-out data. The classifiers perform well on the annotated data and thus confirm the epistemological objectivity of the annotation taxonomy. Interestingly, inter-classifier accuracy scores are higher than those between the two human annotators.

2022

Proceedings of the 15th biennial conference of the Association for Machine Translation in the Americas (Workshop 1: Empirical Translation Process Research)
Michael Carl | Masaru Yamada | Longhui Zou
Proceedings of the 15th biennial conference of the Association for Machine Translation in the Americas (Workshop 1: Empirical Translation Process Research)

Investigating the Impact of Different Pivot Languages on Translation Quality
Longhui Zou | Ali Saeedi | Michael Carl
Proceedings of the 15th biennial conference of the Association for Machine Translation in the Americas (Workshop 1: Empirical Translation Process Research)

Translating via an intermediate pivot language is a common practice, but the impact of the pivot language on the quality of the final translation has not often been investigated. In order to compare the effect of different pivots, we back-translate 41 English source segments via various intermediate channels (Arabic, Chinese and monolingual paraphrasing) into English. We compare the 912 English back-translations of the 41 original English segments using manual evaluation, as well as COMET and various incarnations of BLEU. We compare human from-scratch back-translations with MT back-translations and monolingual paraphrasing. A variation of BLEU (Cum-2) seems to better correlate with our manual evaluation than COMET and the conventional BLEU Cum-4, but a fine-grained qualitative analysis reveals that differences between different pivot languages (Arabic and Chinese) are not captured by the automatized TQA measures.

Proficiency and External Aides: Impact of Translation Brief and Search Conditions on Post-editing Quality
Longhui Zou | Michael Carl | Masaru Yamada | Takanori Mizowaki
Proceedings of the 15th biennial conference of the Association for Machine Translation in the Americas (Workshop 1: Empirical Translation Process Research)

This study investigates the impact of translation briefs and search conditions on post-editing (PE) quality produced by participants with different levels of translation proficiency. We hired five Chinese student translators and seven Japanese professional translators to conduct full post-editing (FPE) and light post-editing (LPE), as described in the translation brief, while controlling two search conditions, i.e., usage of a termbase (TB) and internet search (IS). Our results show that FPE versions of the final translations tend to have fewer errors than LPE versions. The FPE translation brief improves participants’ performance on fluency as compared to LPE, whereas the search condition of TB helps to improve participants’ performance on accuracy as compared to IS. Our findings also indicate that the occurrences of fluency errors produced by experienced translators (i.e., the Japanese participants) are more in line with the specifications addressed in translation briefs, whereas the occurrences of accuracy errors produced by inexperienced translators (i.e., our Chinese participants) depend more on the search conditions.

Using Translation Process Data to Explore Explicitation and Implicitation through Discourse Connectives
Ekaterina Lapshinova-Koltunski | Michael Carl
Proceedings of the 3rd Workshop on Computational Approaches to Discourse

We look into English-German translation process data to analyse explicitation and implicitation phenomena of discourse connectives. For this, we use the database CRITT TPR-DB which contains translation process data with various features that elicit online translation behaviour. We explore the English-German part of the data for discourse connectives that are either omitted or inserted in the target, as well as cases where a weak signal is changed to a strong one, or the other way around. We determine several features that have an impact on cognitive effort during translation for explicitation and implicitation. Our results show that cognitive load caused by implicitation and explicitation may depend on the discourse connectives used, as well as on the strength and the type of the relations the connectives convey.

Trados-to-Translog-II: Adding Gaze and Qualitivity data to the CRITT TPR-DB
Masaru Yamada | Takanori Mizowaki | Longhui Zou | Michael Carl
Proceedings of the 23rd Annual Conference of the European Association for Machine Translation

The CRITT (Center for Research and Innovation in Translation and Translation Technology) provides a Translation Process Research Database (TPR-DB) and a rich set of summary tables and tools that help to investigate translator behavior. In this paper, we describe a new tool in the TPR-DB that converts Trados Studio keylogging data (Qualitivity) into Translog-II format and adds the converted data to the CRITT TPR-DB. The tool is also able to synchronize with the output of various eye-trackers. We describe the components of the new TPR-DB tool and highlight some of the features that it produces in the TPR-DB tables.

2021

Word Alignment Dissimilarity Indicator: Alignment Links as Conceptualizations of a Focused Bilingual Lexicon
Devin Gilbert | Michael Carl
Proceedings for the First Workshop on Modelling Translation: Translatology in the Digital Age

2019

Proceedings of the Second MEMENTO workshop on Modelling Parameters of Cognitive Effort in Translation Production
Michael Carl | Silvia Hansen-Schirra
Proceedings of the Second MEMENTO workshop on Modelling Parameters of Cognitive Effort in Translation Production

Lexical Representation & Retrieval on Monolingual Interpretative text production
Debasish Sahoo | Michael Carl
Proceedings of the Second MEMENTO workshop on Modelling Parameters of Cognitive Effort in Translation Production

2018

Literality and cognitive effort: Japanese and Spanish
Isabel Lacruz | Michael Carl | Masaru Yamada
Proceedings of the Eleventh International Conference on Language Resources and Evaluation (LREC 2018)

2017

A Minimal Cognitive Model for Translating and Post-editing
Moritz Schaeffer | Michael Carl
Proceedings of Machine Translation Summit XVI: Research Track

Experiments in Non-Coherent Post-editing
Cristina Toledo Báez | Moritz Schaeffer | Michael Carl
Proceedings of the Workshop Human-Informed Translation and Interpreting Technology

Market pressure on translation productivity joined with technological innovation is likely to fragment and decontextualise translation jobs even more than is currently the case. Many different translators increasingly work on one document at different places, collaboratively working in the cloud. This paper investigates the effect of decontextualised source texts on behaviour by comparing post-editing of sequentially ordered sentences with shuffled sentences from two different texts. The findings suggest that there is little or no effect of the decontextualised source texts on behaviour.

2016

Measuring Cognitive Translation Effort with Activity Units
Moritz Jonas Schaeffer | Michael Carl | Isabel Lacruz | Akiko Aizawa
Proceedings of the 19th Annual Conference of the European Association for Machine Translation

English-to-Japanese Translation vs. Dictation vs. Post-editing: Comparing Translation Modes in a Multilingual Setting
Michael Carl | Akiko Aizawa | Masaru Yamada
Proceedings of the Tenth International Conference on Language Resources and Evaluation (LREC'16)

Speech-enabled interfaces have the potential to become one of the most efficient and ergonomic environments for human-computer interaction and for text production. However, not much research has been carried out to investigate in detail the processes and strategies involved in the different modes of text production. This paper introduces and evaluates a corpus of more than 55 hours of English-to-Japanese user activity data that were collected within the ENJA15 project, in which translators were observed while writing and speaking translations (translation dictation) and during machine translation post-editing. The transcription of the spoken data, keyboard logging and eye-tracking data were recorded with Translog-II, post-processed and integrated into the CRITT Translation Process Research-DB (TPR-DB), which is publicly available under a creative commons license. The paper presents the ENJA15 data as part of a large multilingual Chinese, Danish, German, Hindi and Spanish translation process data collection of more than 760 translation sessions. It compares the ENJA15 data with the other language pairs and reviews some of its particularities.

2014

CFT13: A resource for research into the post-editing process
Michael Carl | Mercedes Martínez García | Bartolomé Mesa-Lao
Proceedings of the Ninth International Conference on Language Resources and Evaluation (LREC'14)

This paper describes the most recent dataset that has been added to the CRITT Translation Process Research Database (TPR-DB). Under the name CFT13, this new study contains user activity data (UAD) in the form of key-logging and eye-tracking collected during the second CasMaCat field trial in June 2013. The CFT13 is a publicly available resource featuring a number of simple and compound process and product units suited to investigate human-computer interaction while post-editing machine translation outputs.

Evaluating the effects of interactivity in a post-editing workbench
Nancy Underwood | Bartolomé Mesa-Lao | Mercedes García Martínez | Michael Carl | Vicent Alabau | Jesús González-Rubio | Luis A. Leiva | Germán Sanchis-Trilles | Daniel Ortíz-Martínez | Francisco Casacuberta
Proceedings of the Ninth International Conference on Language Resources and Evaluation (LREC'14)

This paper describes the field trial and subsequent evaluation of a post-editing workbench which is currently under development in the EU-funded CasMaCat project. Based on user evaluations of the initial prototype of the workbench, this second prototype of the workbench includes a number of interactive features designed to improve productivity and user satisfaction. Using CasMaCat’s own facilities for logging keystrokes and eye tracking, data were collected from nine post-editors in a professional setting. These data were then used to investigate the effects of the interactive features on productivity, quality, user satisfaction and cognitive load as reflected in the post-editors’ gaze activity. These quantitative results are combined with the qualitative results derived from user questionnaires and interviews conducted with all the participants.

Proceedings of the EACL 2014 Workshop on Humans and Computer-assisted Translation
Ulrich Germann | Michael Carl | Philipp Koehn | Germán Sanchis-Trilles | Francisco Casacuberta | Robin Hill | Sharon O’Brien
Proceedings of the EACL 2014 Workshop on Humans and Computer-assisted Translation

Measuring the Cognitive Effort of Literal Translation Processes
Moritz Schaeffer | Michael Carl
Proceedings of the EACL 2014 Workshop on Humans and Computer-assisted Translation

CASMACAT: A Computer-assisted Translation Workbench
Vicent Alabau | Christian Buck | Michael Carl | Francisco Casacuberta | Mercedes García-Martínez | Ulrich Germann | Jesús González-Rubio | Robin Hill | Philipp Koehn | Luis Leiva | Bartolomé Mesa-Lao | Daniel Ortiz-Martínez | Herve Saint-Amand | Germán Sanchis Trilles | Chara Tsoukala
Proceedings of the Demonstrations at the 14th Conference of the European Chapter of the Association for Computational Linguistics

CASMACAT: cognitive analysis and statistical methods for advanced computer aided translation
Philipp Koehn | Michael Carl | Francisco Casacuberta | Eva Marcos
Proceedings of the 17th Annual Conference of the European Association for Machine Translation

SEECAT: ASR & Eye-tracking enabled computer-assisted translation
Mercedes García-Martínez | Karan Singla | Aniruddha Tammewar | Bartolomé Mesa-Lao | Ankita Thakur | Anusuya M.A. | Srinivas Bangalore | Michael Carl
Proceedings of the 17th Annual Conference of the European Association for Machine Translation

Integrating online and active learning in a computer-assisted translation workbench
Vicent Alabau | Jesús González-Rubio | Daniel Ortiz-Martínez | Germán Sanchis-Trilles | Francisco Casacuberta | Mercedes García-Martínez | Bartolomé Mesa-Lao | Dan Cheung Petersen | Barbara Dragsted | Michael Carl
Workshop on interactive and adaptive machine translation

This paper describes a pilot study with a computer-assisted translation workbench aiming at testing the integration of online learning (OL) and active learning (AL) features. We investigate the effect of these features on translation productivity, using interactive translation prediction (ITP) as a baseline. User activity data were collected from five beta testers using key-logging and eye-tracking. User feedback was also collected at the end of the experiments in the form of retrospective think-aloud protocols. We found that OL performs better than ITP, especially in terms of translation speed. In addition, AL provides better translation quality than ITP for the same levels of user effort. We plan to incorporate these features in the final version of the workbench.

Predicting post-editor profiles from the translation process
Karan Singla | David Orrego-Carmona | Ashleigh Rhea Gonzales | Michael Carl | Srinivas Bangalore
Workshop on interactive and adaptive machine translation

The purpose of the current investigation is to predict post-editor profiles based on user behaviour and demographics using machine learning techniques to gain a better understanding of post-editor styles. Our study extracts process unit features from the CasMaCat LS14 database from the CRITT Translation Process Research Database (TPR-DB). The analysis has two main research goals: We create n-gram models based on user activity and part-of-speech sequences to automatically cluster post-editors, and we use discriminative classifier models to characterize post-editors based on a diverse range of translation process features. The classification and clustering of participants resulting from our study suggest this type of exploration could be used as a tool to develop new translation tool features or customization possibilities.

2013

Automatically Predicting Sentence Translation Difficulty
Abhijit Mishra | Pushpak Bhattacharyya | Michael Carl
Proceedings of the 51st Annual Meeting of the Association for Computational Linguistics (Volume 2: Short Papers)

User Evaluation of Advanced Interaction Features for a Computer-Assisted Translation Workbench
Vicente Alabau | Jesus Gonzalez-Rubio | Luis A. Leiva | Daniel Ortiz-Martínez | German Sanchis-Trilles | Francisco Casacuberta | Bartolomé Mesa-Lao | Ragnar Bonk | Michael Carl | Mercedes Garcia-Martinez
Proceedings of Machine Translation Summit XIV: User track

CASMACAT: Cognitive Analysis and Statistical Methods for Advanced Computer Aided Translation
Philipp Koehn | Michael Carl | Francisco Casacuberta | Eva Marcos
Proceedings of Machine Translation Summit XIV: European projects

Advanced computer aided translation with a web-based workbench
Vicent Alabau | Ragnar Bonk | Christian Buck | Michael Carl | Francisco Casacuberta | Mercedes García-Martínez | Jesús González | Philipp Koehn | Luis Leiva | Bartolomé Mesa-Lao | Daniel Oriz | Hervé Saint-Amand | Germán Sanchis | Chara Tsiukala
Proceedings of the 2nd Workshop on Post-editing Technology and Practice

2012

Proceedings of the First Workshop on Eye-tracking and Natural Language Processing
Michael Carl | Pushpak Bhattacharyya | Kamal Kumar Choudhary
Proceedings of the First Workshop on Eye-tracking and Natural Language Processing

A heuristic-based approach for systematic error correction of gaze data for reading
Abhijit Mishra | Michael Carl | Pushpak Bhattacharyya
Proceedings of the First Workshop on Eye-tracking and Natural Language Processing

Translog-II: a Program for Recording User Activity Data for Empirical Reading and Writing Research
Michael Carl
Proceedings of the Eighth International Conference on Language Resources and Evaluation (LREC'12)

This paper presents a novel implementation of Translog-II. Translog-II is a Windows-oriented program to record and study reading and writing processes on a computer. In our research, it is an instrument to acquire objective, digital data of human translation processes. Like its predecessors, Translog 2000 and Translog 2006, Translog-II consists of two main components: Translog-II Supervisor and Translog-II User, which are used to create a project file, to run text production experiments (a user reads, writes or translates a text) and to replay the session. Translog produces a log file which contains all user activity data of the reading, writing, or translation session, and which can be evaluated by external tools. While there is a large body of translation process research based on Translog, this paper gives an overview of the Translog-II functions and its data visualization options.

The CRITT TPR-DB 1.0: A Database for Empirical Human Translation Process Research
Michael Carl
Workshop on Post-Editing Technology and Practice

This paper introduces a publicly available database of recorded translation sessions for Translation Process Research (TPR). User activity data (UAD) of translators' behavior was collected over the past 5 years in several translation studies with Translog, a data acquisition software which logs keystrokes and gaze data during text reception and production. The database compiles this data into a consistent format which can be processed by various visualization and analysis tools.

2010

Correlating Translation Product and Translation Process Data of Professional and Student Translators
Michael Carl | Matthias Buch-Kromann
Proceedings of the 14th Annual Conference of the European Association for Machine Translation

A computational framework for a cognitive model of human translation processes
Michael Carl
Proceedings of Translating and the Computer 32

2009

Grounding Translation Tools in Translator’s Activity Data
Michael Carl
Beyond Translation Memories: New Tools for Translators Workshop

2008

Modelling human translator behaviour with user-activity data
Michael Carl | Arnt Lykke Jakobsen | Kristian T.H. Jensen
Proceedings of the 12th Annual Conference of the European Association for Machine Translation

Using Log-linear Models for Tuning Machine Translation Output
Michael Carl
Proceedings of the Sixth International Conference on Language Resources and Evaluation (LREC'08)

We describe a set of experiments to explore statistical techniques for ranking and selecting the best translations in a graph of translation hypotheses. In a previous paper (Carl, 2007) we described how the graph of hypotheses is generated through shallow transfer and chunk permutation rules, where nodes consist of vectors representing morpho-syntactic properties of words and phrases. This paper describes a number of methods to train statistical feature functions from some of the vector’s components. The feature functions are trained off-line on different types of text and their log-linear combination is then used to retrieve the best translation paths in the graph. We compare two language modelling toolkits, the CMU and the SRI toolkit, and arrive at three results: 1) models of lemma-based feature functions produce better results than token-based models, 2) adding a PoS-tag feature function to the lemma models improves the output and 3) weights for lexical translations are suited if the training material is similar to the texts to be translated.

Evaluation of a Machine Translation System for Low Resource Languages: METIS-II
Vincent Vandeghinste | Peter Dirix | Ineke Schuurman | Stella Markantonatou | Sokratis Sofianopoulos | Marina Vassiliou | Olga Yannoutsou | Toni Badia | Maite Melero | Gemma Boleda | Michael Carl | Paul Schmidt
Proceedings of the Sixth International Conference on Language Resources and Evaluation (LREC'08)

In this paper we describe the METIS-II system and its evaluation on each of the language pairs: Dutch, German, Greek, and Spanish to English. The METIS-II system envisaged developing a data-driven approach in which no parallel corpus is required, and in which no full parser or extensive rule sets are needed. We describe evaluation on a development test set and on a test set coming from Europarl, and compare our results with SYSTRAN. We also provide some further analysis, researching the impact of the number and source of the reference translations and analysing the results according to test text type. The results are expectably lower for the METIS system, but not at an unattainable distance from a mature system like SYSTRAN.

2007

METIS-II: the German to English MT system
Michael Carl
Proceedings of Machine Translation Summit XI: Papers

Demonstration of the German to English METIS-II MT system
Michael Carl | Sandrine Garnier | Paul Schmidt
Proceedings of the 11th Conference on Theoretical and Methodological Issues in Machine Translation of Natural Languages: Papers

2006

METIS-II: Machine Translation for Low Resource Languages
Vincent Vandeghinste | Ineke Schuurman | Michael Carl | Stella Markantonatou | Toni Badia
Proceedings of the Fifth International Conference on Language Resources and Evaluation (LREC’06)

In this paper we describe a machine translation prototype in which we use only minimal resources for both the source and the target language. A shallow source language analysis, combined with a translation dictionary and a mapping system of source language phenomena into the target language and a target language corpus for generation are all the resources needed in the described system. Several approaches are presented.

A Dictionary Lookup Strategy for Translating of Discontinuous Phrases
Michael Carl | Ecaterina Rascu
Proceedings of the 11th Annual Conference of the European Association for Machine Translation

2005

Using template-grammars for shake & bake paraphrasing
Michael Carl | Ecaterina Rascu | Paul Schmidt
Proceedings of the 10th EAMT Conference: Practical applications of machine translation

2004

Experimenting with phrase-based statistical translation within the IWSLT Chinese-to-English shared translation task
Philippe Langlais | Michael Carl | Oliver Streiter
Proceedings of the First International Workshop on Spoken Language Translation: Evaluation Campaign

Controlling Gender Equality with Shallow NLP Techniques
Michael Carl | Sandrine Garnier | Johann Haller | Anne Altmayer | Bärbel Miemietz
COLING 2004: Proceedings of the 20th International Conference on Computational Linguistics

Using Weighted Abduction to Align Term Variant Translations in Bilingual Texts
Michael Carl | Ecaterina Rascu | Johann Haller
Proceedings of the Fourth International Conference on Language Resources and Evaluation (LREC’04)

2003

Data-assisted controlled translation
Michael Carl
EAMT Workshop: Improving MT through other language technology tools: resources and tools for building MT

Tuning general translation knowledge to a sublanguage
Michael Carl | Philippe Langlais
EAMT Workshop: Improving MT through other language technology tools: resources and tools for building MT

Phrase-based Evaluation of Word-to-Word Alignments
Michael Carl | Sisay Fissaha
Proceedings of the HLT-NAACL 2003 Workshop on Building and Using Parallel Texts: Data Driven Machine Translation and Beyond

Introduction à la traduction guidée par l’exemple (Traduction par analogie)
Michael Carl
Actes de la 10ème conférence sur le Traitement Automatique des Langues Naturelles. Tutoriels

The number of approaches to machine translation has multiplied in recent years. They include, among others, rule-based translation, statistical translation and example-based translation. In this article I describe the main approaches to machine translation. I distinguish approaches based on rules obtained by inspection from approaches based on translation examples. Example-based translation is characterized by the sentence as the ideal translation unit. A new translation is generated by analogy: only the parts that change with respect to a set of known translations are adapted, modified or substituted. I present some techniques that have been used to do this. I will discuss one specific system, EDGAR, in more detail. I will show how aligned translated texts can be prepared through compilation to extract sub-sentential translation units. I present English -> French translation results produced with the EDGAR system and compare them with those of a statistical system.

2002

An Intelligent Terminology Database as a Pre-processor for Statistical Machine Translation
Michael Carl | Philippe Langlais
COLING-02: COMPUTERM 2002: Second International Workshop on Computational Terminology

Toward a hybrid integrated translation environment
Michael Carl | Andy Way | Reinhard Schäler
Proceedings of the 5th Conference of the Association for Machine Translation in the Americas: Technical Papers

In this paper we present a model for the future use of Machine Translation (MT) and Computer Assisted Translation. In order to accommodate the future needs in middle value translations, we discuss a number of MT techniques and architectures. We anticipate a hybrid environment that integrates data- and rule-driven approaches where translations will be routed through the available translation options and consumers will receive accurate information on the quality, pricing and time implications of their translation choice.

2001

Inducing probabilistic invertible translation grammars from aligned texts
Michael Carl
Proceedings of the ACL 2001 Workshop on Computational Natural Language Learning (ConLL)

Workshop on Example-Based machine Translation
Michael Carl | Andy Way
Workshop on Example-Based machine Translation

Inducing translation grammars from bracketed alignments
Michael Carl
Workshop on Example-Based machine Translation

2000

Combining invertible example-based machine translation with translation memory technology
Michael Carl
Proceedings of the Fourth Conference of the Association for Machine Translation in the Americas: Technical Papers

This paper presents an approach to extract invertible translation examples from pre-aligned reference translations. The set of invertible translation examples is used in the Example-Based Machine Translation (EBMT) system EDGAR for translation. Invertible bilingual grammars eliminate translation ambiguities such that each source language parse tree maps into only one target language string. The translation results of EDGAR are compared and combined with those of a translation memory (TM). It is shown that i) best translation results are achieved for the EBMT system when using a bilingual lexicon to support the alignment process, ii) TMs and EBMT systems can be linked in a dynamical sequential manner and iii) the combined translation of TMs and EBMT is in any case better than that of either single system.

A Model of Competence for Corpus-Based Machine Translation
Michael Carl
COLING 2000 Volume 2: The 18th International Conference on Computational Linguistics

1999

Inducing translation templates for example-based machine translation
Michael Carl
Proceedings of Machine Translation Summit VII

This paper describes an example-based machine translation (EBMT) system which relies on various knowledge resources. Morphologic analyses abstract the surface forms of the languages to be translated. A shallow syntactic rule formalism is used to percolate features in derivation trees. Translation examples serve the decomposition of the text to be translated and determine the transfer of lexical values into the target language. Translation templates determine the word order of the target language and the type of phrases (e.g. noun phrase, prepositional phrase, ...) to be generated in the target language. An induction mechanism generalizes translation templates from translation examples. The paper outlines the basic idea underlying the EBMT system and investigates the possibilities and limits of the translation template induction process.

Linking translation memories with example-based machine translation
Michael Carl | Silvia Hansen
Proceedings of Machine Translation Summit VII

The paper reports on experiments which compare the translation outcome of three corpus-based MT systems: a string-based translation memory (STM), a lexeme-based translation memory (LTM) and the example-based machine translation (EBMT) system EDGAR. We use a fully automatic evaluation method to compare the outcome of each MT system and discuss the results. We investigate the benefits of linking different MT strategies such as TM systems and EBMT systems.

1998

A Constructivist Approach to Machine Translation
Michael Carl
New Methods in Language Processing and Computational Natural Language Learning

Shallow Post Morphological Processing with KURD
Michael Carl | Antje Schmidt-Wigger
New Methods in Language Processing and Computational Natural Language Learning
