Mireia Ginestí-Rosell

Also published as: Mireia Ginestí Rosell


Deriving de/het gender classification for Dutch nouns for rule-based MT generation tasks
Bogdan Babych | Jonathan Geiger | Mireia Ginestí Rosell | Kurt Eberle
Proceedings of the 3rd Workshop on Hybrid Approaches to Machine Translation (HyTra)


Design of a hybrid high quality machine translation system
Bogdan Babych | Kurt Eberle | Johanna Geiß | Mireia Ginestí-Rosell | Anthony Hartley | Reinhard Rapp | Serge Sharoff | Martin Thomas
Proceedings of the Joint Workshop on Exploiting Synergies between Information Retrieval and Machine Translation (ESIRMT) and Hybrid Approaches to Machine Translation (HyTra)


An Italian to Catalan RBMT system reusing data from existing language pairs
Antonio Toral | Mireia Ginestí-Rosell | Francis Tyers
Proceedings of the Second International Workshop on Free/Open-Source Rule-Based Machine Translation

This paper presents an Italian→Catalan RBMT system automatically built by combining the linguistic data of the existing pairs Spanish–Catalan and Spanish–Italian. Lightweight manual postprocessing is carried out to fix inconsistencies in the automatically derived dictionaries and to add very frequent words that are missing according to a corpus analysis. The system is evaluated on the KDE4 corpus and outperforms Google Translate by approximately ten absolute points in terms of both TER and GTM.


Joint efforts to further develop and incorporate Apertium into the document management flow at Universitat Oberta de Catalunya
Luis Villarejo Muñoz | Sergio Ortiz Rojas | Mireia Ginestí Rosell
Proceedings of the First International Workshop on Free/Open-Source Rule-Based Machine Translation

This article describes the translation needs of UOC and how Prompsit meets them by further developing Apertium, a free rule-based machine translation system. We first describe the general framework of linguistic needs inside UOC. Section 2 then introduces Apertium and outlines the development scenario that Prompsit carried out. Section 3 outlines the specific needs of UOC and why Apertium was chosen as the machine translation engine. Section 4 describes some of the features specially developed in this project. Section 5 explains how the linguistic data was improved to increase the quality of the output in Catalan and Spanish. Finally, we draw conclusions and outline further work originating from the project.


An Open-Source Shallow-Transfer Machine Translation Toolbox: Consequences of Its Release and Availability
Carme Armentano-Oller | Antonio M. Corbí-Bellot | Mikel L. Forcada | Mireia Ginestí-Rosell | Boyan Bonev | Sergio Ortiz-Rojas | Juan Antonio Pérez-Ortiz | Gema Ramírez-Sánchez | Felipe Sánchez-Martínez
Workshop on open-source machine translation

By the time Machine Translation Summit X is held in September 2005, our group will have released an open-source machine translation toolbox as part of a large government-funded project involving four universities and three linguistic technology companies from Spain. The machine translation toolbox, which will most likely be released under a GPL-like license, includes (a) the open-source engine itself, a modular shallow-transfer machine translation engine suitable for related languages and largely based upon that of systems we have already developed, such as interNOSTRUM for Spanish–Catalan and Traductor Universia for Spanish–Portuguese; (b) extensive documentation (including document type declarations) specifying the XML format of all linguistic (dictionaries, rules) and document format management files; (c) compilers converting these data into the high-speed (tens of thousands of words a second) format used by the engine; and (d) pilot linguistic data for Spanish–Catalan and Spanish–Galician and format management specifications for the HTML, RTF and plain text formats. After briefly describing this toolbox, this paper explores possible consequences of the availability of this architecture, including the community-driven development of machine translation systems for languages lacking this kind of linguistic technology.