Jesús González-Rubio

Also published as: Jesus Gonzalez-Rubio, Jesús González Rubio


2019

Webinterpret Submission to the WMT2019 Shared Task on Parallel Corpus Filtering
Jesús González-Rubio
Proceedings of the Fourth Conference on Machine Translation (Volume 3: Shared Task Papers, Day 2)

This document describes the participation of Webinterpret in the shared task on parallel corpus filtering at the Fourth Conference on Machine Translation (WMT 2019). Here, we describe the main characteristics of our approach and discuss the results obtained on the data sets published for the shared task.

2018

MAJE Submission to the WMT2018 Shared Task on Parallel Corpus Filtering
Marina Fomicheva | Jesús González-Rubio
Proceedings of the Third Conference on Machine Translation: Shared Task Papers

This paper describes the participation of Webinterpret in the shared task on parallel corpus filtering at the Third Conference on Machine Translation (WMT 2018). The paper describes the main characteristics of our approach and discusses the results obtained on the data sets published for the shared task.

2016

Beyond Prefix-Based Interactive Translation Prediction
Jesús González-Rubio | Daniel Ortiz-Martínez | Francisco Casacuberta | José Miguel Benedi Ruiz
Proceedings of the 20th SIGNLL Conference on Computational Natural Language Learning

2014

Evaluating the effects of interactivity in a post-editing workbench
Nancy Underwood | Bartolomé Mesa-Lao | Mercedes García Martínez | Michael Carl | Vicent Alabau | Jesús González-Rubio | Luis A. Leiva | Germán Sanchis-Trilles | Daniel Ortíz-Martínez | Francisco Casacuberta
Proceedings of the Ninth International Conference on Language Resources and Evaluation (LREC'14)

This paper describes the field trial and subsequent evaluation of a post-editing workbench which is currently under development in the EU-funded CasMaCat project. Based on user evaluations of the initial prototype of the workbench, this second prototype of the workbench includes a number of interactive features designed to improve productivity and user satisfaction. Using CasMaCat’s own facilities for logging keystrokes and eye tracking, data were collected from nine post-editors in a professional setting. These data were then used to investigate the effects of the interactive features on productivity, quality, user satisfaction and cognitive load as reflected in the post-editors’ gaze activity. These quantitative results are combined with the qualitative results derived from user questionnaires and interviews conducted with all the participants.

FBK-UPV-UEdin participation in the WMT14 Quality Estimation shared-task
José Guilherme Camargo de Souza | Jesús González-Rubio | Christian Buck | Marco Turchi | Matteo Negri
Proceedings of the Ninth Workshop on Statistical Machine Translation

CASMACAT: A Computer-assisted Translation Workbench
Vicent Alabau | Christian Buck | Michael Carl | Francisco Casacuberta | Mercedes García-Martínez | Ulrich Germann | Jesús González-Rubio | Robin Hill | Philipp Koehn | Luis Leiva | Bartolomé Mesa-Lao | Daniel Ortiz-Martínez | Herve Saint-Amand | Germán Sanchis Trilles | Chara Tsoukala
Proceedings of the Demonstrations at the 14th Conference of the European Chapter of the Association for Computational Linguistics

Inference of Phrase-Based Translation Models via Minimum Description Length
Jesús González-Rubio | Francisco Casacuberta
Proceedings of the 14th Conference of the European Chapter of the Association for Computational Linguistics, volume 2: Short Papers

Integrating online and active learning in a computer-assisted translation workbench
Vicent Alabau | Jesús González-Rubio | Daniel Ortiz-Martínez | Germán Sanchis-Trilles | Francisco Casacuberta | Mercedes García-Martínez | Bartolomé Mesa-Lao | Dan Cheung Petersen | Barbara Dragsted | Michael Carl
Workshop on interactive and adaptive machine translation

This paper describes a pilot study with a computer-assisted translation workbench aimed at testing the integration of online learning (OL) and active learning (AL) features. We investigate the effect of these features on translation productivity, using interactive translation prediction (ITP) as a baseline. User activity data were collected from five beta testers using key-logging and eye-tracking. User feedback was also collected at the end of the experiments in the form of retrospective think-aloud protocols. We found that OL performs better than ITP, especially in terms of translation speed. In addition, AL provides better translation quality than ITP for the same levels of user effort. We plan to incorporate these features in the final version of the workbench.

2013

Interactive Machine Translation using Hierarchical Translation Models
Jesús González-Rubio | Daniel Ortiz-Martínez | José-Miguel Benedí | Francisco Casacuberta
Proceedings of the 2013 Conference on Empirical Methods in Natural Language Processing

Improving the minimum Bayes’ risk combination of machine translation systems
Jesús González-Rubio | Francisco Casacuberta
Proceedings of the 10th International Workshop on Spoken Language Translation: Papers

We investigate the problem of combining the outputs of different translation systems into a minimum Bayes’ risk consensus translation. We explore different risk formulations based on the BLEU score, and provide a dynamic programming decoding algorithm for each of them. In our experiments, these algorithms generated consensus translations with better risk, and more efficiently, than previous proposals.

Emprical study of a two-step approach to estimate translation quality
Jesús González-Rubio | J. Ramón Navarro-Cerdán | Francisco Casacuberta
Proceedings of the 10th International Workshop on Spoken Language Translation: Papers

We present a method to estimate the quality of automatic translations when reference translations are not available. Quality estimation is addressed as a two-step regression problem where multiple features are combined to predict a quality score. Given a set of features, we aim to automatically extract the variables that best explain translation quality, and use them to predict the quality score. The soundness of our approach is supported by the encouraging results obtained in exhaustive experimentation with several feature sets. Moreover, the studied approach is highly scalable, allowing us to employ hundreds of features to predict translation quality.

User Evaluation of Advanced Interaction Features for a Computer-Assisted Translation Workbench
Vicente Alabau | Jesus Gonzalez-Rubio | Luis A. Leiva | Daniel Ortiz-Martínez | German Sanchis-Trilles | Francisco Casacuberta | Bartolomé Mesa-Lao | Ragnar Bonk | Michael Carl | Mercedes Garcia-Martinez
Proceedings of Machine Translation Summit XIV: User track

2012

PRHLT Submission to the WMT12 Quality Estimation Task
Jesús González Rubio | Alberto Sanchis | Francisco Casacuberta
Proceedings of the Seventh Workshop on Statistical Machine Translation

Active learning for interactive machine translation
Jesús González-Rubio | Daniel Ortiz-Martínez | Francisco Casacuberta
Proceedings of the 13th Conference of the European Chapter of the Association for Computational Linguistics

2011

Bilingual segmentation for phrasetable pruning in Statistical Machine Translation
Germán Sanchis-Trilles | Daniel Ortiz-Martínez | Jesús González-Rubio | Jorge González
Proceedings of the 15th Annual Conference of the European Association for Machine Translation

Minimum Bayes-risk System Combination
Jesús González-Rubio | Alfons Juan | Francisco Casacuberta
Proceedings of the 49th Annual Meeting of the Association for Computational Linguistics: Human Language Technologies

The UPV-PRHLT combination system for WMT 2011
Jesús González-Rubio | Francisco Casacuberta
Proceedings of the Sixth Workshop on Statistical Machine Translation

2010

Balancing User Effort and Translation Error in Interactive Machine Translation via Confidence Measures
Jesús González-Rubio | Daniel Ortiz-Martínez | Francisco Casacuberta
Proceedings of the ACL 2010 Conference Short Papers

ITI-UPV machine translation system for IWSLT 2010
Guillem Gascó | Vicent Alabau | Jesús-Andrés Ferrer | Jesús González-Rubio | Martha-Alicia Rocha | Germán Sanchis-Trilles | Francisco Casacuberta | Jorge González | Joan-Andreu Sánchez
Proceedings of the 7th International Workshop on Spoken Language Translation: Evaluation Campaign

This paper presents the submissions of the PRHLT group for the evaluation campaign of the International Workshop on Spoken Language Translation. We focus on the development of reliable translation systems between syntactically different languages (DIALOG task) and on the efficient training of SMT models in resource-rich scenarios (TALK task).

UPV-PRHLT English–Spanish System for WMT10
Germán Sanchis-Trilles | Jesús Andrés-Ferrer | Guillem Gascó | Jesús González-Rubio | Pascual Martínez-Gómez | Martha-Alicia Rocha | Joan-Andreu Sánchez | Francisco Casacuberta
Proceedings of the Joint Fifth Workshop on Statistical Machine Translation and MetricsMATR

The UPV-PRHLT Combination System for WMT 2010
Jesús González-Rubio | Germán Sanchis-Trilles | Joan-Andreu Sánchez | Jesús Andrés-Ferrer | Guillem Gascó | Pascual Martínez-Gómez | Martha-Alicia Rocha | Francisco Casacuberta
Proceedings of the Joint Fifth Workshop on Statistical Machine Translation and MetricsMATR

On the Use of Confidence Measures within an Interactive-predictive Machine Translation System
Jesús González-Rubio | Daniel Ortíz-Martínez | Francisco Casacuberta
Proceedings of the 14th Annual Conference of the European Association for Machine Translation

Saturnalia: A Latin-Catalan Parallel Corpus for Statistical MT
Jesús González-Rubio | Jorge Civera | Alfons Juan | Francisco Casacuberta
Proceedings of the Seventh International Conference on Language Resources and Evaluation (LREC'10)

Currently, a great effort is being made to digitalise large historical document collections for preservation purposes. The documents in these collections are usually written in ancient languages, such as Latin or Greek, which limits the access of the general public to their content due to the language barrier. Therefore, digital libraries aim not only at storing raw images of digitalised documents, but also at annotating them with their corresponding text transcriptions and translations into modern languages. Unfortunately, few electronic resources are available for ancient languages that natural language processing techniques could exploit. This paper describes the compilation process of a novel Latin-Catalan parallel corpus as a new task for statistical machine translation (SMT). Preliminary experimental results are also reported using a state-of-the-art phrase-based SMT system. The results presented in this work reveal the complexity of the task and its challenging but interesting nature for future development.

2008

A novel alignment model inspired on IBM Model 1
Jesús González-Rubio | Germán Sanchis-Trilles | Alfons Juan | Francisco Casacuberta
Proceedings of the 12th Annual Conference of the European Association for Machine Translation