On the use of statistical machine-translation techniques within a memory-based translation system (AMETRA)

Daniel Ortíz, Ismael García-Varea, Francisco Casacuberta, Antonio Lagarda, Jorge González


Abstract
The goal of the AMETRA project is to build a computer-assisted translation tool from Spanish to Basque within the memory-based translation framework. The system is based on a large collection of bilingual word-segments. These segments are obtained using linguistic or statistical techniques from a Spanish-Basque bilingual corpus consisting of sentences extracted from the Basque Country's official government record. One of the tasks within the global information document of the AMETRA project is to study how well-known statistical techniques for translating short sequences can be combined with memory-based translation techniques. In this paper, we address the problem of constructing a statistical module for translating segments. The task undertaken in the AMETRA project is compared with other existing translation tasks. This study includes the results of some preliminary experiments we have carried out using well-known statistical machine translation tools and techniques.
Anthology ID:
2003.mtsummit-papers.40
Volume:
Proceedings of Machine Translation Summit IX: Papers
Month:
September 23-27
Year:
2003
Address:
New Orleans, USA
Venue:
MTSummit
URL:
https://aclanthology.org/2003.mtsummit-papers.40
Cite (ACL):
Daniel Ortíz, Ismael García-Varea, Francisco Casacuberta, Antonio Lagarda, and Jorge González. 2003. On the use of statistical machine-translation techniques within a memory-based translation system (AMETRA). In Proceedings of Machine Translation Summit IX: Papers, New Orleans, USA.
Cite (Informal):
On the use of statistical machine-translation techniques within a memory-based translation system (AMETRA) (Ortíz et al., MTSummit 2003)
PDF:
https://aclanthology.org/2003.mtsummit-papers.40.pdf