Example-based machine translation via the Web

Nano Gough, Andy Way, Mary Hearne


Abstract
One of the limitations of translation memory systems is that the smallest translation units currently accessible are aligned sentential pairs. We propose an example-based machine translation system which uses a ‘phrasal lexicon’ in addition to the aligned sentences in its database. These phrases are extracted from the Penn Treebank using the Marker Hypothesis as a constraint on segmentation. They are then translated by three on-line machine translation (MT) systems, and a number of linguistic resources are automatically constructed which are used in the translation of new input. We perform two experiments on testsets of sentences and noun phrases to demonstrate the effectiveness of our system. In so doing, we obtain insights into the strengths and weaknesses of the selected on-line MT systems. Finally, like many example-based machine translation systems, our approach also suffers from the problem of ‘boundary friction’. Where the quality of resulting translations is compromised as a result, we use a novel, post hoc validation procedure via the World Wide Web to correct imperfect translations prior to their being output to the user.
Anthology ID:
2002.amta-papers.8
Volume:
Proceedings of the 5th Conference of the Association for Machine Translation in the Americas: Technical Papers
Month:
October 8-12
Year:
2002
Address:
Tiburon, USA
Venue:
AMTA
SIG:
Publisher:
Springer
Note:
Pages:
74–83
Language:
URL:
https://link.springer.com/chapter/10.1007/3-540-45820-4_8
DOI:
Bibkey:
Copy Citation:
PDF:
https://link.springer.com/chapter/10.1007/3-540-45820-4_8