Carlos Escolano


Pretrained Speech Encoders and Efficient Fine-tuning Methods for Speech Translation: UPC at IWSLT 2022
Ioannis Tsiamas | Gerard I. Gállego | Carlos Escolano | José Fonollosa | Marta R. Costa-jussà
Proceedings of the 19th International Conference on Spoken Language Translation (IWSLT 2022)

This paper describes the submissions of the UPC Machine Translation group to the IWSLT 2022 Offline Speech Translation and Speech-to-Speech Translation tracks. The offline task involves translating English speech to German, Japanese and Chinese text. Our Speech Translation systems are trained end-to-end and are based on large pretrained speech and text models. We use an efficient fine-tuning technique that trains only specific layers of our system, and explore the use of adapter modules for the non-trainable layers. We further investigate the suitability of different speech encoders (wav2vec 2.0, HuBERT) for our models and the impact of knowledge distillation from the Machine Translation model that we use for the decoder (mBART). For segmenting the IWSLT test sets we fine-tune a pretrained audio segmentation model and achieve improvements of 5 BLEU compared to the given segmentation. Our best single model uses HuBERT and parallel adapters and achieves 29.42 BLEU on English-German MuST-C tst-COMMON and 26.77 on the IWSLT 2020 test set. By ensembling many models, we further increase translation quality to 30.83 and 27.78 BLEU, respectively. Furthermore, our submission for English-Japanese achieves 15.85 and English-Chinese obtains 25.63 BLEU on the MuST-C tst-COMMON sets. Finally, we extend our system to perform English-German Speech-to-Speech Translation with a pretrained Text-to-Speech model.


End-to-End Speech Translation with Pre-trained Models and Adapters: UPC at IWSLT 2021
Gerard I. Gállego | Ioannis Tsiamas | Carlos Escolano | José A. R. Fonollosa | Marta R. Costa-jussà
Proceedings of the 18th International Conference on Spoken Language Translation (IWSLT 2021)

This paper describes the submission to the IWSLT 2021 offline speech translation task by the UPC Machine Translation group. The task consists of building a system capable of translating English audio recordings extracted from TED talks into German text. Submitted systems can be either cascade or end-to-end and use a custom or given segmentation. Our submission is an end-to-end speech translation system, which combines pre-trained models (Wav2Vec 2.0 and mBART) with coupling modules between the encoder and decoder, and uses an efficient fine-tuning technique, which trains only 20% of its total parameters. We show that adding an Adapter to the system and pre-training it can increase the convergence speed and the final result, with which we achieve a BLEU score of 27.3 on the MuST-C test set. Our final model is an ensemble that obtains a BLEU score of 28.22 on the same set. Our submission also uses a custom segmentation algorithm that employs pre-trained Wav2Vec 2.0 for identifying periods of untranscribable text and can bring improvements of 2.5 to 3 BLEU on the IWSLT 2019 test set, as compared to the result with the given segmentation.

The TALP-UPC Participation in WMT21 News Translation Task: an mBART-based NMT Approach
Carlos Escolano | Ioannis Tsiamas | Christine Basta | Javier Ferrando | Marta R. Costa-jussà | José A. R. Fonollosa
Proceedings of the Sixth Conference on Machine Translation

This paper describes the submission to the WMT 2021 news translation shared task by the UPC Machine Translation group. The goal of the task is to translate German to French (De-Fr) and French to German (Fr-De). Our submission focuses on fine-tuning a pre-trained model to take advantage of monolingual data. We fine-tune mBART50 using the filtered data, and additionally, we train a Transformer model on the same data from scratch. In the experiments, we show that fine-tuning mBART50 results in 31.69 BLEU for De-Fr and 23.63 BLEU for Fr-De, an increase of 2.71 and 1.90 BLEU, respectively, over the model we train from scratch. Our final submission is an ensemble of these two models, which gains a further 0.3 BLEU for Fr-De.

Multilingual Machine Translation: Closing the Gap between Shared and Language-specific Encoder-Decoders
Carlos Escolano | Marta R. Costa-jussà | José A. R. Fonollosa | Mikel Artetxe
Proceedings of the 16th Conference of the European Chapter of the Association for Computational Linguistics: Main Volume

State-of-the-art multilingual machine translation relies on a universal encoder-decoder, which requires retraining the entire system to add new languages. In this paper, we propose an alternative approach that is based on language-specific encoder-decoders, and can thus be more easily extended to new languages by learning their corresponding modules. So as to encourage a common interlingua representation, we simultaneously train the N initial languages. Our experiments show that the proposed approach outperforms the universal encoder-decoder by 3.28 BLEU points on average, while allowing new languages to be added without retraining the rest of the modules. All in all, our work closes the gap between shared and language-specific encoder-decoders, advancing toward modular multilingual machine translation systems that can be flexibly extended in lifelong learning settings.

Enriching the Transformer with Linguistic Factors for Low-Resource Machine Translation
Jordi Armengol-Estapé | Marta R. Costa-jussà | Carlos Escolano
Proceedings of the International Conference on Recent Advances in Natural Language Processing (RANLP 2021)

Introducing factors, that is to say, word features such as linguistic information referring to the source tokens, is known to improve the results of neural machine translation systems in certain settings, typically in recurrent architectures. This study proposes enhancing the current state-of-the-art neural machine translation architecture, the Transformer, so that external knowledge can be introduced. In particular, our proposed modification, the Factored Transformer, uses linguistic factors that insert additional knowledge into the machine translation system. Apart from using different kinds of features, we study the effect of different architectural configurations. Specifically, we analyze the performance of combining words and features at the embedding level or at the encoder level, and we experiment with two different combination strategies. With the best-found configuration, we show improvements of 0.8 BLEU over the baseline Transformer in the IWSLT German-to-English task. Moreover, we experiment with the more challenging FLoRes English-to-Nepali benchmark, which includes both extremely low-resourced and very distant languages, and obtain an improvement of 1.2 BLEU.


The TALP-UPC System Description for WMT20 News Translation Task: Multilingual Adaptation for Low Resource MT
Carlos Escolano | Marta R. Costa-jussà | José A. R. Fonollosa
Proceedings of the Fifth Conference on Machine Translation

In this article, we describe the TALP-UPC participation in the WMT20 news translation shared task for Tamil-English. Given the low amount of parallel training data, we resort to adapting the task to a multilingual system to benefit from the positive transfer from high-resource languages. We use iterative backtranslation to fine-tune the system and benefit from the monolingual data available. In order to measure the effectiveness of such methods, we compare our results to a bilingual baseline system.


The TALP-UPC Machine Translation Systems for WMT19 News Translation Task: Pivoting Techniques for Low Resource MT
Noe Casas | José A. R. Fonollosa | Carlos Escolano | Christine Basta | Marta R. Costa-jussà
Proceedings of the Fourth Conference on Machine Translation (Volume 2: Shared Task Papers, Day 1)

In this article, we describe the TALP-UPC research group participation in the WMT19 news translation shared task for Kazakh-English. Given the low amount of parallel training data, we resort to using Russian as pivot language, training subword-based statistical translation systems for Russian-Kazakh and Russian-English that were then used to create two synthetic pseudo-parallel corpora for Kazakh-English and English-Kazakh respectively. Finally, a self-attention model based on the decoder part of the Transformer architecture was trained on the two pseudo-parallel corpora.

Multilingual, Multi-scale and Multi-layer Visualization of Intermediate Representations
Carlos Escolano | Marta R. Costa-jussà | Elora Lacroux | Pere-Pau Vázquez
Proceedings of the 2019 Conference on Empirical Methods in Natural Language Processing and the 9th International Joint Conference on Natural Language Processing (EMNLP-IJCNLP): System Demonstrations

The main alternatives nowadays to deal with sequences are Recurrent Neural Network (RNN) architectures and the Transformer. In this context, both RNNs and the Transformer have been used as encoder-decoder architectures with multiple layers in each module. Far beyond this, these architectures are the basis for the contextual word embeddings which are revolutionizing most natural language downstream applications. However, intermediate representations in either the RNN or Transformer architectures can be difficult to interpret. To make these layer representations more accessible and meaningful, we introduce a web-based tool that visualizes them both at the sentence and token level. We present three use cases. The first analyses gender issues in contextual word embeddings. The second and third show multilingual intermediate representations for sentences and tokens, and the evolution of these intermediate representations along the multiple layers of the decoder, in the context of multilingual machine translation.

From Bilingual to Multilingual Neural Machine Translation by Incremental Training
Carlos Escolano | Marta R. Costa-jussà | José A. R. Fonollosa
Proceedings of the 57th Annual Meeting of the Association for Computational Linguistics: Student Research Workshop

Multilingual Neural Machine Translation approaches are based on the use of task-specific models, and adding one more language can only be done by retraining the whole system. In this work, we propose a new training schedule, based on joint training and language-independent encoder/decoder modules, that allows the system to scale to more languages without modifying the previous components and allows for zero-shot translation. This work in progress shows results close to the state of the art in the WMT task.


The TALP-UPC Machine Translation Systems for WMT18 News Shared Translation Task
Noe Casas | Carlos Escolano | Marta R. Costa-jussà | José A. R. Fonollosa
Proceedings of the Third Conference on Machine Translation: Shared Task Papers

In this article we describe the TALP-UPC research group participation in the WMT18 news shared translation task for Finnish-English and Estonian-English within the multi-lingual subtrack. All of our primary submissions implement an attention-based Neural Machine Translation architecture. Given that Finnish and Estonian belong to the same language family and are similar, we use as training data the combination of the datasets of both language pairs to palliate the data scarcity of each individual pair. We also report the translation quality of systems trained on individual language pair data to serve as baseline and comparison reference.


Byte-based Neural Machine Translation
Marta R. Costa-jussà | Carlos Escolano | José A. R. Fonollosa
Proceedings of the First Workshop on Subword and Character Level Models in NLP

This paper presents experiments comparing character-based and byte-based neural machine translation systems. The main motivation of the byte-based neural machine translation system is to build multi-lingual neural machine translation systems that can share the same vocabulary. We compare the performance of both systems in several language pairs and see that test performance is similar for most language pairs, while the training time is slightly reduced in the case of byte-based neural machine translation.

The TALP-UPC Neural Machine Translation System for German/Finnish-English Using the Inverse Direction Model in Rescoring
Carlos Escolano | Marta R. Costa-jussà | José A. R. Fonollosa
Proceedings of the Second Conference on Machine Translation


The TALP-UPC Spanish–English WMT Biomedical Task: Bilingual Embeddings and Char-based Neural Language Model Rescoring in a Phrase-based System
Marta R. Costa-jussà | Cristina España-Bonet | Pranava Madhyastha | Carlos Escolano | José A. R. Fonollosa
Proceedings of the First Conference on Machine Translation: Volume 2, Shared Task Papers