Jindřich Helcl

Also published as: Jindrich Helcl


Non-Autoregressive Machine Translation: It’s Not as Fast as it Seems
Jindřich Helcl | Barry Haddow | Alexandra Birch
Proceedings of the 2022 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies

Efficient machine translation models are commercially important as they can increase inference speeds, and reduce costs and carbon emissions. Recently, there has been much interest in non-autoregressive (NAR) models, which promise faster translation. In parallel to the research on NAR models, there have been successful attempts to create optimized autoregressive models as part of the WMT shared task on efficient translation. In this paper, we point out flaws in the evaluation methodology present in the literature on NAR models and we provide a fair comparison between a state-of-the-art NAR model and the autoregressive submissions to the shared task. We make the case for consistent evaluation of NAR models, and also for the importance of comparing NAR models with other widely used methods for improving efficiency. We run experiments with a connectionist-temporal-classification-based (CTC) NAR model implemented in C++ and compare it with AR models using wall clock times. Our results show that, although NAR models are faster on GPUs, with small batch sizes, they are almost always slower under more realistic usage conditions. We call for more realistic and extensive evaluation of NAR models in future work.


The University of Edinburgh’s English-German and English-Hausa Submissions to the WMT21 News Translation Task
Pinzhen Chen | Jindřich Helcl | Ulrich Germann | Laurie Burchell | Nikolay Bogoychev | Antonio Valerio Miceli Barone | Jonas Waldendorf | Alexandra Birch | Kenneth Heafield
Proceedings of the Sixth Conference on Machine Translation

This paper presents the University of Edinburgh’s constrained submissions of English-German and English-Hausa systems to the WMT 2021 shared task on news translation. We build En-De systems in three stages: corpus filtering, back-translation, and fine-tuning. For En-Ha we use an iterative back-translation approach on top of pre-trained En-De models and investigate vocabulary embedding mapping.

Surprise Language Challenge: Developing a Neural Machine Translation System between Pashto and English in Two Months
Alexandra Birch | Barry Haddow | Antonio Valerio Miceli Barone | Jindrich Helcl | Jonas Waldendorf | Felipe Sánchez Martínez | Mikel Forcada | Víctor Sánchez Cartagena | Juan Antonio Pérez-Ortiz | Miquel Esplà-Gomis | Wilker Aziz | Lina Murady | Sevi Sariisik | Peggy van der Kreeft | Kay Macquarrie
Proceedings of Machine Translation Summit XVIII: Research Track

In the media industry, the focus of global reporting can shift overnight. There is a compelling need to be able to develop new machine translation systems in a short period of time, in order to more efficiently cover quickly developing stories. As part of the EU project GoURMET, which focusses on low-resource machine translation, our media partners selected a surprise language for which a machine translation system had to be built and evaluated in two months (February and March 2021). The language selected was Pashto, an Indo-Iranian language spoken in Afghanistan, Pakistan and India. In this period we completed the full pipeline of development of a neural machine translation system: data crawling, cleaning, aligning, creating test sets, developing and testing models, and delivering them to the user partners. In this paper we describe rapid data creation and experiments with transfer learning and pretraining for this low-resource language pair. We find that starting from an existing large model pre-trained on 50 languages leads to far better BLEU scores than pretraining on one high-resource language pair with a smaller model. We also present human evaluation of our systems, which indicates that the resulting systems perform better than a freely available commercial system when translating from English into Pashto, and similarly when translating from Pashto into English.


Expand and Filter: CUNI and LMU Systems for the WNGT 2020 Duolingo Shared Task
Jindřich Libovický | Zdeněk Kasner | Jindřich Helcl | Ondřej Dušek
Proceedings of the Fourth Workshop on Neural Generation and Translation

We present our submission to the Simultaneous Translation And Paraphrase for Language Education (STAPLE) challenge. We used a standard Transformer model for translation, with a crosslingual classifier predicting correct translations on the output n-best list. To increase the diversity of the outputs, we used additional data to train the translation model, and we trained a paraphrasing model based on the Levenshtein Transformer architecture to generate further synonymous translations. The paraphrasing results were again filtered using our classifier. While the use of additional data and our classifier filter were able to improve results, the paraphrasing model produced too many invalid outputs to further improve the output quality. Our model without the paraphrasing component finished in the middle of the field for the shared task, improving over the best baseline by a margin of 10-22% weighted F1 absolute.


CUNI System for the WMT19 Robustness Task
Jindřich Helcl | Jindřich Libovický | Martin Popel
Proceedings of the Fourth Conference on Machine Translation (Volume 2: Shared Task Papers, Day 1)

We present our submission to the WMT19 Robustness Task. Our baseline system is the Charles University (CUNI) Transformer system trained for the WMT18 shared task on News Translation. Quantitative results show that the CUNI Transformer system is already far more robust to noisy input than the LSTM-based baseline provided by the task organizers. We further improved the performance of our model by fine-tuning on the in-domain noisy data without influencing the translation quality on the news domain.


Neural Monkey: The Current State and Beyond
Jindřich Helcl | Jindřich Libovický | Tom Kocmi | Tomáš Musil | Ondřej Cífka | Dušan Variš | Ondřej Bojar
Proceedings of the 13th Conference of the Association for Machine Translation in the Americas (Volume 1: Research Track)

Input Combination Strategies for Multi-Source Transformer Decoder
Jindřich Libovický | Jindřich Helcl | David Mareček
Proceedings of the Third Conference on Machine Translation: Research Papers

In multi-source sequence-to-sequence tasks, the attention mechanism can be modeled in several ways. This topic has been thoroughly studied on recurrent architectures. In this paper, we extend the previous work to the encoder-decoder attention in the Transformer architecture. We propose four different input combination strategies for the encoder-decoder attention: serial, parallel, flat, and hierarchical. We evaluate our methods on tasks of multimodal translation and translation with multiple source languages. The experiments show that the models are able to use multiple sources and improve over single source baselines.

CUNI System for the WMT18 Multimodal Translation Task
Jindřich Helcl | Jindřich Libovický | Dušan Variš
Proceedings of the Third Conference on Machine Translation: Shared Task Papers

We present our submission to the WMT18 Multimodal Translation Task. The main feature of our submission is applying a self-attentive network instead of a recurrent neural network. We evaluate two methods of incorporating the visual features in the model: first, we include the image representation as another input to the network; second, we train the model to predict the visual features and use it as an auxiliary objective. For our submission, we acquired both textual and multimodal additional data. Both of the proposed methods yield significant improvements over recurrent networks and self-attentive textual baselines.

End-to-End Non-Autoregressive Neural Machine Translation with Connectionist Temporal Classification
Jindřich Libovický | Jindřich Helcl
Proceedings of the 2018 Conference on Empirical Methods in Natural Language Processing

Autoregressive decoding is the only part of sequence-to-sequence models that prevents them from massive parallelization at inference time. Non-autoregressive models enable the decoder to generate all output symbols independently in parallel. We present a novel non-autoregressive architecture based on connectionist temporal classification and evaluate it on the task of neural machine translation. Unlike other non-autoregressive methods which operate in several steps, our model can be trained end-to-end. We conduct experiments on the WMT English-Romanian and English-German datasets. Our models achieve a significant speedup over the autoregressive models, keeping the translation quality comparable to other non-autoregressive models.


Deep architectures for Neural Machine Translation
Antonio Valerio Miceli Barone | Jindřich Helcl | Rico Sennrich | Barry Haddow | Alexandra Birch
Proceedings of the Second Conference on Machine Translation

CUNI System for the WMT17 Multimodal Translation Task
Jindřich Helcl | Jindřich Libovický
Proceedings of the Second Conference on Machine Translation

Results of the WMT17 Neural MT Training Task
Ondřej Bojar | Jindřich Helcl | Tom Kocmi | Jindřich Libovický | Tomáš Musil
Proceedings of the Second Conference on Machine Translation

Attention Strategies for Multi-Source Sequence-to-Sequence Learning
Jindřich Libovický | Jindřich Helcl
Proceedings of the 55th Annual Meeting of the Association for Computational Linguistics (Volume 2: Short Papers)

Modeling attention in neural multi-source sequence-to-sequence learning remains a relatively unexplored area, despite its usefulness in tasks that incorporate multiple source languages or modalities. We propose two novel approaches to combine the outputs of attention mechanisms over each source sequence, flat and hierarchical. We compare the proposed methods with existing techniques and present results of systematic evaluation of those methods on the WMT16 Multimodal Translation and Automatic Post-editing tasks. We show that the proposed methods achieve competitive results on both tasks.


CUNI System for WMT16 Automatic Post-Editing and Multimodal Translation Tasks
Jindřich Libovický | Jindřich Helcl | Marek Tlustý | Ondřej Bojar | Pavel Pecina
Proceedings of the First Conference on Machine Translation: Volume 2, Shared Task Papers

Deeper Machine Translation and Evaluation for German
Eleftherios Avramidis | Vivien Macketanz | Aljoscha Burchardt | Jindrich Helcl | Hans Uszkoreit
Proceedings of the 2nd Deep Machine Translation Workshop

UFAL Submissions to the IWSLT 2016 MT Track
Ondřej Bojar | Ondřej Cífka | Jindřich Helcl | Tom Kocmi | Roman Sudarikov
Proceedings of the 13th International Conference on Spoken Language Translation

We present our submissions to the IWSLT 2016 machine translation task, as our first attempt to translate subtitles and one of our early experiments with neural machine translation (NMT). We focus primarily on the English→Czech translation direction but also perform basic adaptation experiments for NMT with German and with the reverse direction. Three MT systems are tested: (1) our Chimera, a tight combination of phrase-based MT and deep linguistic processing, (2) Neural Monkey, our implementation of an NMT system in TensorFlow, and (3) Nematus, an established NMT system.