Conference of the Association for Machine Translation in the Americas (2004)


Case study: implementing MT for the translation of pre-sales marketing and post-sales software deployment documentation at Mycom International
Jeffrey Allen

Several major telecommunications companies have invested significantly in controlled language, machine translation, or both over the past 10 years.

A speech-to-speech translation system for Catalan, Spanish, and English
Victoria Arranz | Elisabet Comelles | David Farwell | Climent Nadeu | Jaume Padrell | Albert Febrer | Dorcas Alexander | Kay Peterson

In this paper we describe the FAME interlingual speech-to-speech translation system for Spanish, Catalan and English, which is intended to assist users in the reservation of a hotel room when calling or visiting abroad. The system has been developed as an extension of the existing NESPOLE! translation system [4], which translates between English, German, Italian and French. After a brief introduction we describe the Spanish and Catalan system components, including speech recognition, transcription to IF mapping, IF to text generation and speech synthesis. We also present a task-oriented evaluation method used to inform system development and some preliminary results.

Multi-Align: combining linguistic and statistical techniques to improve alignments for adaptable MT
Necip Fazil Ayan | Bonnie Dorr | Nizar Habash

An adaptable statistical or hybrid MT system relies heavily on the quality of word-level alignments of real-world data. Statistical alignment approaches provide a reasonable initial estimate for word alignment. However, they cannot handle certain types of linguistic phenomena such as long-distance dependencies and structural differences between languages. We address this issue in Multi-Align, a new framework for incremental testing of different alignment algorithms and their combinations. Our design allows users to tune their systems to the properties of a particular genre/domain while still benefiting from general linguistic knowledge associated with a language pair. We demonstrate that a combination of statistical and linguistically-informed alignments can resolve translation divergences during the alignment process.

A modified Burrows-Wheeler transform for highly scalable example-based translation
Ralf D. Brown

The Burrows-Wheeler Transform (BWT) was originally developed for data compression, but can also be applied to indexing text. In this paper, an adaptation of the BWT to word-based indexing of the training corpus for an example-based machine translation (EBMT) system is presented. The adapted BWT embeds the necessary information to retrieve matched training instances without requiring any additional space and can be instantiated in a compressed form which reduces disk space and memory requirements by about 40% while still remaining searchable without decompression. Both the speed advantage from O(log N) lookups compared to the O(N) lookups in the inverted-file index which had previously been used and the structure of the index itself act as enablers for additional capabilities and run-time speed. Because the BWT groups all instances of any n-gram together, it can be used to quickly enumerate the most-frequent n-grams, for which translations can be precomputed and stored, resulting in an order-of-magnitude speedup at run time.
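
The word-based indexing idea can be illustrated with a minimal sketch. This is not the paper's implementation: it uses a plain word-level suffix array (the sorted-suffix structure from which a BWT is derived) rather than the modified BWT itself, and the function names and toy corpus are assumptions made for illustration. Because sorting places all occurrences of an n-gram next to each other, two binary searches are enough to find them.

```python
def build_suffix_array(words):
    """Return the start positions of all word-level suffixes, sorted lexicographically."""
    # Simple O(N^2 log N) construction; adequate for a toy corpus only.
    return sorted(range(len(words)), key=lambda i: words[i:])


def find_ngram(words, sa, ngram):
    """Return the sorted corpus positions where `ngram` (a list of words) occurs."""
    n = len(ngram)
    ngram = list(ngram)

    # Lower bound: first suffix whose first n words are >= ngram.
    lo, hi = 0, len(sa)
    while lo < hi:
        mid = (lo + hi) // 2
        if words[sa[mid]:sa[mid] + n] < ngram:
            lo = mid + 1
        else:
            hi = mid
    start = lo

    # Upper bound: first suffix whose first n words are > ngram.
    hi = len(sa)
    while lo < hi:
        mid = (lo + hi) // 2
        if words[sa[mid]:sa[mid] + n] <= ngram:
            lo = mid + 1
        else:
            hi = mid

    # All matches sit in one contiguous block of the suffix array.
    return sorted(sa[start:lo])


corpus = "the cat sat on the mat the cat ran".split()
index = build_suffix_array(corpus)
print(find_ngram(corpus, index, ["the", "cat"]))  # [0, 6]
```

The contiguous block `sa[start:lo]` is also what makes frequency counting cheap: the number of occurrences of an n-gram is simply the block length.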

Designing a controlled language for the machine translation of medical protocols: the case of English to Chinese
Sylviane Cardey | Peter Greenfield | Xiahong Wu

Because of its clarity and its simplified way of writing, controlled language (CL) is receiving increasing attention from NLP (natural language processing) researchers, for example in machine translation. The users of controlled languages are of two types: the authors of documents written in the controlled language, and the end-user readers of those documents. As a subset of natural language, controlled language restricts vocabulary, grammar, and style for the purpose of reducing or eliminating both ambiguity and complexity. The use of controlled language can help decrease the complexity of natural language to a certain degree and thus improve translation quality, especially for the partially or fully automatic translation of non-general-purpose texts such as technical documents, manuals, instructions and medical reports. Our focus is on the machine translation of medical protocols applied in the field of zoonosis. In this article we briefly explain why controlled language is preferred in our research work, what benefits it will bring to our work, and how we could make use of this existing technique to facilitate our translation tool.

Normalizing German and English inflectional morphology to improve statistical word alignment
Simon Corston-Oliver | Michael Gamon

German has a richer system of inflectional morphology than English, which causes problems for current approaches to statistical word alignment. Using Giza++ as a reference implementation of IBM Model 1, an HMM-based alignment model and IBM Model 4, we measure the impact of normalizing inflectional morphology on German-English statistical word alignment. We demonstrate that normalizing inflectional morphology improves the perplexity of models and reduces alignment errors.

System description: a highly interactive speech-to-speech translation system
Mike Dillinger | Mark Seligman

Spoken Translation, Inc. (STI) of Berkeley, CA has developed a commercial system for interactive speech-to-speech machine translation designed for both high accuracy and broad linguistic and topical coverage. Planned use is in situations requiring both of these features, for example in helping Spanish-speaking patients to communicate with English-speaking doctors, nurses, and other health-care staff.

A fluency error categorization scheme to guide automated machine translation evaluation
Debbie Elliott | Anthony Hartley | Eric Atwell

Existing automated MT evaluation methods often require expert human translations. These are produced for every language pair evaluated and, due to this expense, subsequent evaluations tend to rely on the same texts, which do not necessarily reflect real MT use. In contrast, we are designing an automated MT evaluation system, intended for use by post-editors, purchasers and developers, that requires nothing but the raw MT output. Furthermore, our research is based on texts that reflect corporate use of MT. This paper describes our first step in system design: a hierarchical classification scheme of fluency errors in English MT output, to enable us to identify error types and frequencies, and guide the selection of errors for automated detection. We present results from the statistical analysis of 20,000 words of MT output, manually annotated using our classification scheme, and describe correlations between error frequencies and human scores for fluency and adequacy.

Online MT services and real users’ needs: an empirical usability evaluation
Federico Gaspari

This paper presents an empirical evaluation of the main usability factors that play a significant role in the interaction with on-line Machine Translation (MT) services. The investigation is carried out from the point of view of typical users with an emphasis on their real needs, and focuses on a set of key usability criteria that have an impact on the successful deployment of Internet-based MT technology. A small-scale evaluation of the performance of five popular web-based MT systems against the selected usability criteria shows that different approaches to interaction design can dramatically affect the level of user satisfaction. There are strong indications that the results of this study can be fed back into the development of on-line MT services to enhance their design, thus ensuring that they meet the requirements and expectations of a wide range of Internet users.

Counting, measuring, ordering: translation problems and solutions
Stephen Helmreich | David Farwell

This paper describes some difficulties associated with the translation of numbers (scalars) used for counting, measuring, or selecting items or properties. A set of problematic issues is described, and the presence of these difficulties is quantified by examining a set of texts and translations. An approach to a solution is suggested.

Feedback from the field: the challenge of users in motion
L. Hernandez | J. Turner | M. Holland

Feedback from field deployments of machine translation (MT) is instructive but hard to obtain, especially in the case of soldiers deployed in mobile and stressful environments. We first consider the process of acquiring feedback: the difficulty of getting and interpreting it, the kinds of information that have been used in place of or as predictors of direct feedback, and the validity and completeness of that information. We then look at how to better forecast the utility of MT in deployments so that feedback from the field is focused on aspects that can be fixed or enhanced rather than on overall failure or viability of the technology. We draw examples from document and speech translation.

The Georgetown-IBM experiment demonstrated in January 1954
W. John Hutchins

The public demonstration of a Russian-English machine translation system in New York in January 1954 – a collaboration of IBM and Georgetown University – caused a great deal of public interest and much controversy. Although a small-scale experiment of just 250 words and six ‘grammar’ rules it raised expectations of automatic systems capable of high quality translation in the near future. This paper describes the system, its background, its impact and its implications.

Pharaoh: a beam search decoder for phrase-based statistical machine translation models
Philipp Koehn

We describe Pharaoh, a freely available decoder for phrase-based statistical machine translation models. The decoder is the implementation of an efficient dynamic programming search algorithm with lattice generation and XML markup for external components.

The PARS family of machine translation systems for Dutch: system description/demonstration
Edward A. Kool | Michael S. Blekhman | Andrei Kursin | Alla Rakova

Lingvistica is developing a family of MT systems for Dutch to and from English, German, and French. PARS/H, a Dutch to and from English system, is a fully commercial product, while PARS/HD, for Dutch to and from German MT, and PARS/HF, for Dutch to and from French, are under way. The PARS/Dutch family of MT systems is based on Lingvistica’s rule-based Dutch morphological-syntactic analyzer and synthesizer, which deals with vowel and consonant alterations in Dutch words as well as Dutch syntactic analysis and synthesis. In addition, a German analyzer and synthesizer have been developed, and a similar French one is being constructed. Representative Dutch and German grammatical dictionaries have been created, comprising Dutch and German words and their complete morphological descriptions: class and subclass characteristics, alteration features, and morphological declension/conjugation paradigms. The PARS/H dictionary editor provides simple dictionary updating. Numerous specialist dictionaries have been created, and more are in preparation. The user interface integrates PARS/H with MS Word and MS Internet Explorer, fully preserving the corresponding formats. Integration with MS Excel and many other applications is under way.

Rapid MT experience in an LCTL (Pashto)
Craig Kopris

A year ago we were faced with a challenge: rapidly develop a machine translation (MT) system for written Pashto with limited resources. We had three full-time native speakers (one with a Ph.D. in general linguistics, and translation experience) and one part-time descriptive linguist with a typological-functional background. In addition, we had a legacy MT software system, which neither the speakers nor the linguist was familiar with, although we had the opportunity to occasionally confer with experienced system users. There were also dated published grammars of varying (usually inadequate) quality available.

The significance of recall in automatic metrics for MT evaluation
Alon Lavie | Kenji Sagae | Shyamsundar Jayaraman

Recent research has shown that a balanced harmonic mean (F1 measure) of unigram precision and recall outperforms the widely used BLEU and NIST metrics for Machine Translation evaluation in terms of correlation with human judgments of translation quality. We show that significantly better correlations can be achieved by placing more weight on recall than on precision. While this may seem unexpected, since BLEU and NIST focus on n-gram precision and disregard recall, our experiments show that correlation with human judgments is highest when almost all of the weight is assigned to recall. We also show that stemming is significantly beneficial not just to simpler unigram precision and recall based metrics, but also to BLEU and NIST.
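
The effect of shifting weight from precision to recall can be sketched as follows. The abstract does not give the exact formula, so this sketch assumes a standard weighted harmonic mean, F = P·R / (α·P + (1 − α)·R), where α is the weight on recall and α = 0.5 gives the balanced F1. The function name, whitespace tokenization and example sentences are illustrative assumptions, and stemming is not applied.

```python
from collections import Counter


def unigram_f(candidate, reference, alpha=0.5):
    """Weighted harmonic mean of unigram precision and recall.

    alpha is the weight placed on recall (assumed parameterization):
    alpha = 0.5 is the balanced F1, and alpha close to 1 is almost pure recall.
    """
    cand = candidate.split()
    ref = reference.split()
    if not cand or not ref:
        return 0.0

    # Clipped unigram matches: each reference word can be matched at most
    # as many times as it occurs in the reference.
    matches = sum((Counter(cand) & Counter(ref)).values())
    if matches == 0:
        return 0.0

    precision = matches / len(cand)
    recall = matches / len(ref)
    return precision * recall / (alpha * precision + (1 - alpha) * recall)


hyp = "the cat"
ref = "the cat sat on the mat"
print(unigram_f(hyp, ref, alpha=0.5))  # balanced F1
print(unigram_f(hyp, ref, alpha=0.9))  # recall-heavy weighting
```

In this toy case the short candidate has perfect precision but low recall, so the recall-heavy setting scores it lower than the balanced one.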

Alignment of bilingual named entities in parallel corpora using statistical model
Chun-Jen Lee | Jason S. Chang | Thomas C. Chuang

Named entities make up a large part of documents. Extracting named entities is crucial to various applications of natural language processing. Although efforts to identify named entities within monolingual documents are numerous, extracting bilingual named entities has not been investigated extensively owing to the complexity of the task. In this paper, we describe a statistical phrase translation model and a statistical transliteration model. Based on these models, we propose a new method to align bilingual named entities in parallel corpora. Experimental results indicate that a satisfactory precision rate can be achieved. To enhance the performance, we also describe how to improve the proposed method by incorporating approximate matching and person name recognition. Experimental results show that performance is significantly improved with the enhancement.

Weather report translation using a translation memory
Thomas Leplus | Philippe Langlais | Guy Lapalme

We describe the use of a translation memory in the context of a reconstruction of a landmark application of machine translation, the Canadian English to French weather report translation system. This system, which has been in operation for more than 20 years, was developed using a classical symbolic approach. We describe our experiment in developing an alternative approach based on the analysis of hundreds of thousands of weather reports. We show that it is possible to obtain excellent translations using translation memory techniques and we analyze the kinds of translation errors that are induced by this approach.

Keyword translation from English to Chinese for multilingual QA
Frank Lin | Teruko Mitamura

The Keyword Translator is a part of the Question Analyzer module in the JAVELIN Question-Answering system; it translates the keywords, which are used to query documents and extract answers, from one language to another. Much work has been done in the area of query translation for CLIR or MLIR; however, much of it has focused on methods using hard-to-obtain and domain-specific resources, and evaluation is often based on retrieval performance rather than translation correctness. In this paper we describe methods combining easily accessible, general-purpose MT systems to improve keyword translation correctness. We also describe methods that utilize the question sentence available to a question-answering system to improve translation correctness. We show that using multiple MT systems and the question sentence to translate keywords from English to Mandarin Chinese can significantly improve keyword translation correctness.

Extraction of name and transliteration in monolingual and parallel corpora
Tracy Lin | Jian-Cheng Wu | Jason S. Chang

Named-entities in free text represent a challenge to text analysis in Machine Translation and Cross Language Information Retrieval. These phrases are often transliterated into another language with a different sound inventory and writing system. Named-entities found in free text are often not listed in bilingual dictionaries. Although it is possible to identify and translate named-entities on the fly without a list of proper names and transliterations, an extensive list of existing transliterations will certainly ensure a high precision rate. We use a seed list of proper names and transliterations to train a Machine Transliteration Model. With the model it is possible to extract proper names and their transliterations in monolingual or parallel corpora with high precision and recall rates.

Error analysis of two types of grammar for the purpose of automatic rule refinement
Ariadna Font Llitjós | Katharina Probst | Jaime Carbonell

This paper compares a manually written MT grammar and a grammar learned automatically from an English-Spanish elicitation corpus with the ultimate purpose of automatically refining the translation rules. The experiment described here shows that the kind of automatic refinement operations required to correct a translation not only varies depending on the type of error, but also on the type of grammar. This paper describes the two types of grammars and gives a detailed error analysis of their output, indicating what kinds of refinements are required in each case.

The contribution of end-users to the TransType2 project
Elliott Macklovitch

TransType2 is a novel kind of interactive MT in which the system and the user collaborate in drafting a target text, the system’s contribution taking the form of predictions that extend what the translator has already typed in. TT2 is also an international research project in which end-users are represented by two translation firms. We describe the contribution of these translators to the project, from their input to the system’s functional specifications to their participation in quarterly user trials. We also present the results of the latest round of user trials.

An experiment on Japanese-Uighur machine translation and its evaluation
Muhtar Mahsut | Yasuhiro Ogawa | Kazue Sugino | Katsuhiko Toyama | Yasuyoshi Inagaki

This paper describes an evaluation experiment on a Japanese-Uighur machine translation system that consists of verbal suffix processing, case suffix processing, phonetic change processing, and a Japanese-Uighur dictionary of about 20,000 words. Japanese and Uighur share many syntactic and structural similarities, including word order, the existence and functions of case suffixes and verbal suffixes, and morphological structure. For these reasons, we argue that Japanese can be translated into Uighur by word-by-word alignment after morphological analysis of the input sentences, without complicated syntactic analysis. To test this argument under practical conditions, we chose three articles about environmental issues that appeared in Nippon Keizai Shinbun and translated them with our MT system. We used the correctness of phrases in the output sentences as the evaluation criterion. The experiment achieved a precision of 84.8%.

A structurally diverse minimal corpus for eliciting structural mappings between languages
Katharina Probst | Alon Lavie

We describe an approach to creating a small but diverse corpus in English that can be used to elicit information about any target language. The focus of the corpus is on structural information. The resulting bilingual corpus can then be used for natural language processing tasks such as inferring transfer mappings for Machine Translation. The corpus is sufficiently small that a bilingual user can translate and word-align it within a matter of hours. We describe how the corpus is created and how its structural diversity is ensured. We then argue that it is not necessary to introduce a large amount of redundancy into the corpus. This is shown by creating an increasingly redundant corpus and observing that the information gained converges as redundancy increases.

Investigation of intelligibility judgments
Florence Reeder

This paper describes an intelligibility snap-judgment test. In this exercise, participants are shown a series of human translations and machine translations and are asked to determine whether the author was human or machine. The experiment shows that snap judgments on intelligibility are made successfully and that system rankings on snap judgments are consistent with more detailed intelligibility measures. In addition to demonstrating a quick intelligibility judgment, requiring only a few minutes of each participant's time, it details the types of errors which led to the snap judgments.

Interlingual annotation for MT development
Florence Reeder | Bonnie Dorr | David Farwell | Nizar Habash | Stephen Helmreich | Eduard Hovy | Lori Levin | Teruko Mitamura | Keith Miller | Owen Rambow | Advaith Siddharthan

MT systems that use only superficial representations, including the current generation of statistical MT systems, have been successful and useful. However, they will experience a plateau in quality, much like other “silver bullet” approaches to MT. We pursue work on the development of interlingual representations for use in symbolic or hybrid MT systems. In this paper, we describe the creation of an interlingua and the development of a corpus of semantically annotated text, to be validated in six languages and evaluated in several ways. We have established a distributed, well-functioning research methodology, designed a preliminary interlingua notation, created annotation manuals and tools, developed a test collection in six languages with associated English translations, annotated some 150 translations, and designed and applied various annotation metrics. We describe the data sets being annotated and the interlingual (IL) representation language which uses two ontologies and a systematic theta-role list. We present the annotation tools built and outline the annotation process. Following this, we describe our evaluation methodology and conclude with a summary of issues that have arisen.

Machine translation of online product support articles using data-driven MT system
Stephen D. Richardson

At AMTA 2002, we reported on a pilot project to machine translate Microsoft’s Product Support Knowledge Base into Spanish. The successful pilot has since resulted in the permanent deployment of both Spanish and Japanese versions of the knowledge base, as well as ongoing pilot projects for French and German. The translated articles in each case have been produced by MSR-MT, Microsoft Research’s data-driven MT system, which has been trained on well over a million bilingual sentence pairs for each target language from previously translated materials contained in translation memories and glossaries. This paper describes our experience in deploying this system and the (positive) customer response to the availability of machine translated articles, as well as other uses of MSR-MT either planned or underway at Microsoft.

Maintenance issues for machine translation systems
Nestor Rychtyckyj

At AMTA-2002 we presented a deployed application of Machine Translation (MT) at Ford Motor Company in the domain of vehicle assembly process planning. This application uses an MT system developed by SYSTRAN to translate Ford’s manufacturing process build instructions from English to Spanish, German, Dutch and Portuguese. Our MT system has already translated over 2 million instructions into these target languages and is an integral part of our manufacturing process planning to support Ford’s assembly plants in Europe, Mexico and South America. A major component of the MT system development was the creation of a set of technical glossaries for the correct translation of automotive and Ford-specific terminology. Due to the dynamic nature of the automobile industry we need to keep these technical glossaries current as our terminology frequently changes due to the introduction of new manufacturing technologies, vehicles and vehicle features. In addition, our end-users need to be able to test and modify translations and see these results deployed in a timely manner. In this paper we will discuss the tools and business process that we have developed in conjunction with SYSTRAN in order to maintain and customize our MT system and improve its performance in the face of an ever-changing business environment.

Improving domain-specific word alignment with a general bilingual corpus
Hua Wu | Haifeng Wang

In conventional word alignment methods, some employ statistical models or statistical measures, which need large-scale bilingual sentence-aligned training corpora. Others employ dictionaries to guide alignment selection. However, these methods achieve unsatisfactory alignment results when performing word alignment on a small-scale domain-specific bilingual corpus without terminological lexicons. This paper proposes an approach to improve word alignment in a specific domain, in which only a small-scale domain-specific corpus is available, by adapting the word alignment information in the general domain to the specific domain. This approach first trains two statistical word alignment models with the large-scale corpus in the general domain and the small-scale corpus in the specific domain respectively, and then improves the domain-specific word alignment with these two models. Experimental results show a significant improvement in terms of both alignment precision and recall, achieving a relative error rate reduction of 21.96% as compared with state-of-the-art technologies.

A super-function based Japanese-Chinese machine translation system for business users
Xin Zhao | Fuji Ren | Stefan Voß

In this paper, a Japanese-Chinese Machine Translation (MT) system using the so-called Super-Function (SF) approach is presented. An SF is a functional relation mapping sentences from one language to another. The core of the system uses the SF approach to translate without going through the syntactic and semantic analysis that many MT systems usually perform. Our work focuses on business users, for whom MT is often a great help if they need an immediate idea of the content of texts like e-mail messages, reports, web pages, or business letters. In this paper, we aim at performing MT between Japanese and Chinese to translate business letters by the SF-based technique.