Antonio Valerio Miceli-Barone

Also published as: Antonio Valerio Miceli Barone, Antonio Valerio Miceli Barone, Antonio Valerio Miceli Barone


2022

pdf bib
Survey of Low-Resource Machine Translation
Barry Haddow | Rachel Bawden | Antonio Valerio Miceli Barone | Jindřich Helcl | Alexandra Birch
Computational Linguistics, Volume 48, Issue 3 - September 2022

We present a survey covering the state of the art in low-resource machine translation (MT) research. There are currently around 7,000 languages spoken in the world and almost all language pairs lack significant resources for training machine translation models. There has been increasing interest in research addressing the challenge of producing useful translation models when very little translated training data is available. We present a summary of this topical research field and provide a description of the techniques evaluated by researchers in several recent shared tasks in low-resource MT.

pdf bib
Distributionally Robust Recurrent Decoders with Random Network Distillation
Antonio Valerio Miceli Barone | Alexandra Birch | Rico Sennrich
Proceedings of the 7th Workshop on Representation Learning for NLP

Neural machine learning models can successfully model language that is similar to their training distribution, but they are highly susceptible to degradation under distribution shift, which occurs in many practical applications when processing out-of-domain (OOD) text. This has been attributed to “shortcut learning”":" relying on weak correlations over arbitrary large contexts. We propose a method based on OOD detection with Random Network Distillation to allow an autoregressive language model to automatically disregard OOD context during inference, smoothly transitioning towards a less expressive but more robust model as the data becomes more OOD, while retaining its full context capability when operating in-distribution. We apply our method to a GRU architecture, demonstrating improvements on multiple language modeling (LM) datasets.

2021

pdf bib
The University of Edinburgh’s English-German and English-Hausa Submissions to the WMT21 News Translation Task
Pinzhen Chen | Jindřich Helcl | Ulrich Germann | Laurie Burchell | Nikolay Bogoychev | Antonio Valerio Miceli Barone | Jonas Waldendorf | Alexandra Birch | Kenneth Heafield
Proceedings of the Sixth Conference on Machine Translation

This paper presents the University of Edinburgh’s constrained submissions of English-German and English-Hausa systems to the WMT 2021 shared task on news translation. We build En-De systems in three stages: corpus filtering, back-translation, and fine-tuning. For En-Ha we use an iterative back-translation approach on top of pre-trained En-De models and investigate vocabulary embedding mapping.

pdf bib
Surprise Language Challenge: Developing a Neural Machine Translation System between Pashto and English in Two Months
Alexandra Birch | Barry Haddow | Antonio Valerio Miceli Barone | Jindrich Helcl | Jonas Waldendorf | Felipe Sánchez Martínez | Mikel Forcada | Víctor Sánchez Cartagena | Juan Antonio Pérez-Ortiz | Miquel Esplà-Gomis | Wilker Aziz | Lina Murady | Sevi Sariisik | Peggy van der Kreeft | Kay Macquarrie
Proceedings of Machine Translation Summit XVIII: Research Track

In the media industry and the focus of global reporting can shift overnight. There is a compelling need to be able to develop new machine translation systems in a short period of time and in order to more efficiently cover quickly developing stories. As part of the EU project GoURMET and which focusses on low-resource machine translation and our media partners selected a surprise language for which a machine translation system had to be built and evaluated in two months(February and March 2021). The language selected was Pashto and an Indo-Iranian language spoken in Afghanistan and Pakistan and India. In this period we completed the full pipeline of development of a neural machine translation system: data crawling and cleaning and aligning and creating test sets and developing and testing models and and delivering them to the user partners. In this paperwe describe rapid data creation and experiments with transfer learning and pretraining for this low-resource language pair. We find that starting from an existing large model pre-trained on 50languages leads to far better BLEU scores than pretraining on one high-resource language pair with a smaller model. We also present human evaluation of our systems and which indicates that the resulting systems perform better than a freely available commercial system when translating from English into Pashto direction and and similarly when translating from Pashto into English.

2020

pdf bib
The University of Edinburgh’s English-Tamil and English-Inuktitut Submissions to the WMT20 News Translation Task
Rachel Bawden | Alexandra Birch | Radina Dobreva | Arturo Oncevay | Antonio Valerio Miceli Barone | Philip Williams
Proceedings of the Fifth Conference on Machine Translation

We describe the University of Edinburgh’s submissions to the WMT20 news translation shared task for the low resource language pair English-Tamil and the mid-resource language pair English-Inuktitut. We use the neural machine translation transformer architecture for all submissions and explore a variety of techniques to improve translation quality to compensate for the lack of parallel training data. For the very low-resource English-Tamil, this involves exploring pretraining, using both language model objectives and translation using an unrelated high-resource language pair (German-English), and iterative backtranslation. For English-Inuktitut, we explore the use of multilingual systems, which, despite not being part of the primary submission, would have achieved the best results on the test set.

2019

pdf bib
The University of Edinburgh’s Submissions to the WMT19 News Translation Task
Rachel Bawden | Nikolay Bogoychev | Ulrich Germann | Roman Grundkiewicz | Faheem Kirefu | Antonio Valerio Miceli Barone | Alexandra Birch
Proceedings of the Fourth Conference on Machine Translation (Volume 2: Shared Task Papers, Day 1)

The University of Edinburgh participated in the WMT19 Shared Task on News Translation in six language directions: English↔Gujarati, English↔Chinese, German→English, and English→Czech. For all translation directions, we created or used back-translations of monolingual data in the target language as additional synthetic training data. For English↔Gujarati, we also explored semi-supervised MT with cross-lingual language model pre-training, and translation pivoting through Hindi. For translation to and from Chinese, we investigated character-based tokenisation vs. sub-word segmentation of Chinese text. For German→English, we studied the impact of vast amounts of back-translated training data on translation quality, gaining a few additional insights over Edunov et al. (2018). For English→Czech, we compared different preprocessing and tokenisation regimes.

pdf bib
Global Under-Resourced Media Translation (GoURMET)
Alexandra Birch | Barry Haddow | Ivan Tito | Antonio Valerio Miceli Barone | Rachel Bawden | Felipe Sánchez-Martínez | Mikel L. Forcada | Miquel Esplà-Gomis | Víctor Sánchez-Cartagena | Juan Antonio Pérez-Ortiz | Wilker Aziz | Andrew Secker | Peggy van der Kreeft
Proceedings of Machine Translation Summit XVII: Translator, Project and User Tracks

2018

pdf bib
Low-rank passthrough neural networks
Antonio Valerio Miceli Barone
Proceedings of the Workshop on Deep Learning Approaches for Low-Resource NLP

Various common deep learning architectures, such as LSTMs, GRUs, Resnets and Highway Networks, employ state passthrough connections that support training with high feed-forward depth or recurrence over many time steps. These “Passthrough Networks” architectures also enable the decoupling of the network state size from the number of parameters of the network, a possibility has been studied by Sak et al. (2014) with their low-rank parametrization of the LSTM. In this work we extend this line of research, proposing effective, low-rank and low-rank plus diagonal matrix parametrizations for Passthrough Networks which exploit this decoupling property, reducing the data complexity and memory requirements of the network while preserving its memory capacity. This is particularly beneficial in low-resource settings as it supports expressive models with a compact parametrization less susceptible to overfitting. We present competitive experimental results on several tasks, including language modeling and a near state of the art result on sequential randomly-permuted MNIST classification, a hard task on natural data.

pdf bib
The University of Edinburgh’s Submissions to the WMT18 News Translation Task
Barry Haddow | Nikolay Bogoychev | Denis Emelin | Ulrich Germann | Roman Grundkiewicz | Kenneth Heafield | Antonio Valerio Miceli Barone | Rico Sennrich
Proceedings of the Third Conference on Machine Translation: Shared Task Papers

The University of Edinburgh made submissions to all 14 language pairs in the news translation task, with strong performances in most pairs. We introduce new RNN-variant, mixed RNN/Transformer ensembles, data selection and weighting, and extensions to back-translation.

pdf bib
Improving Machine Translation of Educational Content via Crowdsourcing
Maximiliana Behnke | Antonio Valerio Miceli Barone | Rico Sennrich | Vilelmini Sosoni | Thanasis Naskos | Eirini Takoulidou | Maria Stasimioti | Menno van Zaanen | Sheila Castilho | Federico Gaspari | Panayota Georgakopoulou | Valia Kordoni | Markus Egg | Katia Lida Kermanidis
Proceedings of the Eleventh International Conference on Language Resources and Evaluation (LREC 2018)

2017

pdf bib
A Comparative Quality Evaluation of PBSMT and NMT using Professional Translators
Sheila Castilho | Joss Moorkens | Federico Gaspari | Rico Sennrich | Vilelmini Sosoni | Panayota Georgakopoulou | Pintu Lohar | Andy Way | Antonio Valerio Miceli-Barone | Maria Gialama
Proceedings of Machine Translation Summit XVI: Research Track

pdf bib
A Parallel Corpus of Python Functions and Documentation Strings for Automated Code Documentation and Code Generation
Antonio Valerio Miceli Barone | Rico Sennrich
Proceedings of the Eighth International Joint Conference on Natural Language Processing (Volume 2: Short Papers)

Automated documentation of programming source code and automated code generation from natural language are challenging tasks of both practical and scientific interest. Progress in these areas has been limited by the low availability of parallel corpora of code and natural language descriptions, which tend to be small and constrained to specific domains. In this work we introduce a large and diverse parallel corpus of a hundred thousands Python functions with their documentation strings (“docstrings”) generated by scraping open source repositories on GitHub. We describe baseline results for the code documentation and code generation tasks obtained by neural machine translation. We also experiment with data augmentation techniques to further increase the amount of training data. We release our datasets and processing scripts in order to stimulate research in these areas.

pdf bib
Nematus: a Toolkit for Neural Machine Translation
Rico Sennrich | Orhan Firat | Kyunghyun Cho | Alexandra Birch | Barry Haddow | Julian Hitschler | Marcin Junczys-Dowmunt | Samuel Läubli | Antonio Valerio Miceli Barone | Jozef Mokry | Maria Nădejde
Proceedings of the Software Demonstrations of the 15th Conference of the European Chapter of the Association for Computational Linguistics

We present Nematus, a toolkit for Neural Machine Translation. The toolkit prioritizes high translation accuracy, usability, and extensibility. Nematus has been used to build top-performing submissions to shared translation tasks at WMT and IWSLT, and has been used to train systems for production environments.

pdf bib
Deep architectures for Neural Machine Translation
Antonio Valerio Miceli Barone | Jindřich Helcl | Rico Sennrich | Barry Haddow | Alexandra Birch
Proceedings of the Second Conference on Machine Translation

pdf bib
Copied Monolingual Data Improves Low-Resource Neural Machine Translation
Anna Currey | Antonio Valerio Miceli Barone | Kenneth Heafield
Proceedings of the Second Conference on Machine Translation

pdf bib
The University of Edinburgh’s Neural MT Systems for WMT17
Rico Sennrich | Alexandra Birch | Anna Currey | Ulrich Germann | Barry Haddow | Kenneth Heafield | Antonio Valerio Miceli Barone | Philip Williams
Proceedings of the Second Conference on Machine Translation

pdf bib
Regularization techniques for fine-tuning in neural machine translation
Antonio Valerio Miceli Barone | Barry Haddow | Ulrich Germann | Rico Sennrich
Proceedings of the 2017 Conference on Empirical Methods in Natural Language Processing

We investigate techniques for supervised domain adaptation for neural machine translation where an existing model trained on a large out-of-domain dataset is adapted to a small in-domain dataset. In this scenario, overfitting is a major challenge. We investigate a number of techniques to reduce overfitting and improve transfer learning, including regularization techniques such as dropout and L2-regularization towards an out-of-domain prior. In addition, we introduce tuneout, a novel regularization technique inspired by dropout. We apply these techniques, alone and in combination, to neural machine translation, obtaining improvements on IWSLT datasets for English→German and English→Russian. We also investigate the amounts of in-domain training data needed for domain adaptation in NMT, and find a logarithmic relationship between the amount of training data and gain in BLEU score.

2016

pdf bib
Towards cross-lingual distributed representations without parallel text trained with adversarial autoencoders
Antonio Valerio Miceli Barone
Proceedings of the 1st Workshop on Representation Learning for NLP

2015

pdf bib
Non-projective Dependency-based Pre-Reordering with Recurrent Neural Network for Machine Translation
Antonio Valerio Miceli-Barone | Giuseppe Attardi
Proceedings of the 53rd Annual Meeting of the Association for Computational Linguistics and the 7th International Joint Conference on Natural Language Processing (Volume 1: Long Papers)

pdf bib
Non-projective Dependency-based Pre-Reordering with Recurrent Neural Network for Machine Translation
Antonio Valerio Miceli-Barone | Giuseppe Attardi
Proceedings of the Ninth Workshop on Syntax, Semantics and Structure in Statistical Translation

pdf bib
Translation reranking using source phrase dependency features
Antonio Valerio Miceli-Barone
Proceedings of the Ninth Workshop on Syntax, Semantics and Structure in Statistical Translation

2013

pdf bib
Pre-Reordering for Machine Translation Using Transition-Based Walks on Dependency Parse Trees
Antonio Valerio Miceli-Barone | Giuseppe Attardi
Proceedings of the Eighth Workshop on Statistical Machine Translation

2012

pdf bib
Dependency Parsing Domain Adaptation using Transductive SVM
Antonio Valerio Miceli-Barone | Giuseppe Attardi
Proceedings of the Joint Workshop on Unsupervised and Semi-Supervised Learning in NLP

2011

pdf bib
A Dependency Based Statistical Translation Model
Giuseppe Attardi | Atanas Chanev | Antonio Valerio Miceli Barone
Proceedings of Fifth Workshop on Syntax, Semantics and Structure in Statistical Translation