This paper presents an automatic speech translation system aimed at live subtitling of conference presentations. We describe the overall architecture and key processing components. More importantly, we explain our strategy for building a complex system for end-users from numerous individual components, each of which has been tested only in laboratory conditions. The system is a working prototype that is routinely tested in recognizing English, Czech, and German speech and presenting it translated simultaneously into 42 target languages.
This paper is an ELITR system submission for the non-native speech translation task at IWSLT 2020. We describe systems for offline ASR, real-time ASR, and our cascaded approach to offline SLT and real-time SLT. We select our primary candidates from a pool of pre-existing systems, develop a new end-to-end general ASR system, and a hybrid ASR trained on non-native speech. The provided small validation set prevents us from carrying out a complex validation, but we submit all the unselected candidates for contrastive evaluation on the test set.
This paper presents our progress towards deploying a versatile communication platform in the task of highly multilingual live speech translation for conferences and remote meetings live subtitling. The platform has been designed with a focus on very low latency and high flexibility while allowing research prototypes of speech and text processing tools to be easily connected, regardless of where they physically run. We outline our architecture solution and also briefly compare it with the ELG platform. Technical details are provided on the most important components and we summarize the test deployment events we ran so far.
We describe the University of Edinburgh’s submissions to the WMT20 news translation shared task for the low resource language pair English-Tamil and the mid-resource language pair English-Inuktitut. We use the neural machine translation transformer architecture for all submissions and explore a variety of techniques to improve translation quality to compensate for the lack of parallel training data. For the very low-resource English-Tamil, this involves exploring pretraining, using both language model objectives and translation using an unrelated high-resource language pair (German-English), and iterative backtranslation. For English-Inuktitut, we explore the use of multilingual systems, which, despite not being part of the primary submission, would have achieved the best results on the test set.
ELITR (European Live Translator) project aims to create a speech translation system for simultaneous subtitling of conferences and online meetings targetting up to 43 languages. The technology is tested by the Supreme Audit Office of the Czech Republic and by alfaview®, a German online conferencing system. Other project goals are to advance document-level and multilingual machine translation, automatic speech recognition, and automatic minuting.
Massively multilingual models for neural machine translation (NMT) are theoretically attractive, but often underperform bilingual models and deliver poor zero-shot translations. In this paper, we explore ways to improve them. We argue that multilingual NMT requires stronger modeling capacity to support language pairs with varying typological characteristics, and overcome this bottleneck via language-specific components and deepening NMT architectures. We identify the off-target translation issue (i.e. translating into a wrong target language) as the major source of the inferior zero-shot performance, and propose random online backtranslation to enforce the translation of unseen training language pairs. Experiments on OPUS-100 (a novel multilingual dataset with 100 languages) show that our approach substantially narrows the performance gap with bilingual models in both one-to-many and many-to-many settings, and improves zero-shot performance by ~10 BLEU, approaching conventional pivot-based methods.
This paper describes the joint submission to the IWSLT 2019 English to Czech task by Samsung RD Institute, Poland, and the University of Edinburgh. Our submission was ultimately produced by combining four Transformer systems through a mixture of ensembling and reranking.
This paper describes the joint submission to the IWSLT 2018 Low Resource MT task by Samsung R&D Institute, Poland, and the University of Edinburgh. We focused on supplementing the very limited in-domain Basque-English training data with out-of-domain data, with synthetic data, and with data for other language pairs. We also experimented with a variety of model architectures and features, which included the development of extensions to the Nematus toolkit. Our submission was ultimately produced by a system combination in which we reranked translations from our strongest individual system using multiple weaker systems.
The tutorial explains in detail syntax-based statistical machine translation with synchronous context free grammars (SCFG). It is aimed at researchers who have little background in this area, and gives a comprehensive overview about the main models and methods.While syntax-based models in statistical machine translation have a long history, spanning back almost 20 years, they have only recently shown superior translation quality over the more commonly used phrase-based models, and are now considered state of the art for some language pairs, such as Chinese-English (since ISI's submission to NIST 2006), and English-German (since Edinburgh's submission to WMT 2012).While the field is very dynamic, there is a core set of methods that have become dominant. Such SCFG models are implemented in the open source machine translation toolkit Moses, and the tutors draw from the practical experience of its development.The tutorial focuses on explaining core established concepts in SCFG-based approaches, which are the most popular in this area. The main goal of the tutorial is for the audience to understand how these systems work end-to-end. We review as much relevant literature as necessary, but the tutorial is not a primarily research survey.The tutorial is rounded up with open problems and advanced topics, such as computational challenges, different formalisms for syntax-based models and inclusion of semantics.