Nano Gough


2004

Robust large-scale EBMT with marker-based segmentation
Nano Gough | Andy Way
Proceedings of the 10th Conference on Theoretical and Methodological Issues in Machine Translation of Natural Languages

Example-based controlled translation
Nano Gough | Andy Way
Proceedings of the 9th EAMT Workshop: Broadening horizons of machine translation and its applications

2003

wEBMT: Developing and Validating an Example-Based Machine Translation System using the World Wide Web
Andy Way | Nano Gough
Computational Linguistics, Volume 29, Number 3, September 2003: Special Issue on the Web as Corpus

Controlled generation in example-based machine translation
Nano Gough | Andy Way
Proceedings of Machine Translation Summit IX: Papers

The theme of controlled translation is currently in vogue in the area of MT. Recent research (Schäler et al., 2003; Carl, 2003) hypothesises that EBMT systems are perhaps best suited to this challenging task. In this paper, we present an EBMT system where the generation of the target string is filtered by data written according to controlled language specifications. As far as we are aware, this is the only research available on this topic. In the field of controlled language applications, it is more usual to constrain the source language in this way rather than the target. We translate a small corpus of controlled English into French using the on-line MT system Logomedia, and seed the memories of our EBMT system with a set of automatically induced lexical resources using the Marker Hypothesis as a segmentation tool. We test our system on a large set of sentences extracted from a Sun Translation Memory, and provide both an automatic and a human evaluation. For comparative purposes, we also provide results for Logomedia itself.

Teaching and assessing empirical approaches to machine translation
Andy Way | Nano Gough
Workshop on Teaching Translation Technologies and Tools

Empirical methods in Natural Language Processing (NLP) and Machine Translation (MT) have become mainstream in the research field. Accordingly, it is important that the tools and techniques in these paradigms be taught to potential future researchers and developers in University courses. While many dedicated courses on Statistical NLP can be found, there are few, if any, courses on Empirical Approaches to MT. This paper presents the development and assessment of one such course as taught to final-year undergraduates taking a degree in NLP.

2002

Example-based machine translation via the Web
Nano Gough | Andy Way | Mary Hearne
Proceedings of the 5th Conference of the Association for Machine Translation in the Americas: Technical Papers

One of the limitations of translation memory systems is that the smallest translation units currently accessible are aligned sentential pairs. We propose an example-based machine translation system which uses a ‘phrasal lexicon’ in addition to the aligned sentences in its database. These phrases are extracted from the Penn Treebank using the Marker Hypothesis as a constraint on segmentation. They are then translated by three on-line machine translation (MT) systems, and a number of linguistic resources are automatically constructed which are used in the translation of new input. We perform two experiments on test sets of sentences and noun phrases to demonstrate the effectiveness of our system. In so doing, we obtain insights into the strengths and weaknesses of the selected on-line MT systems. Finally, like many example-based machine translation systems, our approach also suffers from the problem of ‘boundary friction’. Where the quality of resulting translations is compromised as a result, we use a novel, post hoc validation procedure via the World Wide Web to correct imperfect translations prior to their being output to the user.
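
As a rough illustration of segmentation constrained by the Marker Hypothesis, the Python sketch below opens a new chunk at each closed-class "marker" word, provided the current chunk already contains at least one non-marker word. The marker list and the function name marker_segment are hypothetical and chosen only for this example; the abstract does not specify the marker categories or the exact procedure used in the paper.

```python
# Hypothetical, deliberately tiny set of closed-class "marker" words
# (determiners, prepositions, conjunctions). A real system would use
# a much fuller list; this one is an assumption for illustration.
MARKERS = {"the", "a", "an", "in", "on", "at", "of", "to", "with", "and", "or"}


def marker_segment(sentence):
    """Split a sentence into chunks that each begin at a marker word.

    A marker word opens a new chunk only when the current chunk already
    holds at least one non-marker word, so runs of adjacent markers
    (e.g. "on the") stay together in one chunk.
    """
    chunks = []
    current = []
    has_content = False  # does `current` contain a non-marker word yet?
    for token in sentence.split():
        is_marker = token.lower() in MARKERS
        if is_marker and has_content:
            chunks.append(" ".join(current))
            current = []
            has_content = False
        current.append(token)
        if not is_marker:
            has_content = True
    if current:
        chunks.append(" ".join(current))
    return chunks


print(marker_segment("the dog sat on the mat with a ball"))
```

With this toy marker list, the example sentence is split into "the dog sat", "on the mat" and "with a ball". Chunks of this kind are the sort of sub-sentential unit that a phrasal lexicon could store alongside aligned sentence pairs.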