Xiangkai Zeng
2019
Empirical Evaluation of Active Learning Techniques for Neural MT
Xiangkai Zeng | Sarthak Garg | Rajen Chatterjee | Udhyakumar Nallasamy | Matthias Paulik
Proceedings of the 2nd Workshop on Deep Learning Approaches for Low-Resource NLP (DeepLo 2019)
Active learning (AL) for machine translation (MT) has been well studied for the phrase-based MT paradigm. Several AL algorithms for data sampling have been proposed over the years. However, given the rapid advancement in neural methods, these algorithms have not been thoroughly investigated in the context of neural MT (NMT). In this work, we address this missing aspect by conducting a systematic comparison of different AL methods in a simulated AL framework. Our experimental setup to compare different AL methods uses: i) a state-of-the-art NMT architecture to achieve realistic results; and ii) the same dataset (WMT’13 English-Spanish) to allow a fair comparison across different methods. We then demonstrate how recent advancements in unsupervised pre-training and paraphrastic embedding can be used to improve existing AL methods. Finally, we propose a neural extension for an AL sampling method used in the context of phrase-based MT: Round Trip Translation Likelihood (RTTL). RTTL uses a bidirectional translation model to estimate the loss of information during translation and outperforms previous methods.
Improving Robustness of Neural Machine Translation with Multi-task Learning
Shuyan Zhou | Xiangkai Zeng | Yingqi Zhou | Antonios Anastasopoulos | Graham Neubig
Proceedings of the Fourth Conference on Machine Translation (Volume 2: Shared Task Papers, Day 1)
While neural machine translation (NMT) achieves remarkable performance on clean, in-domain text, performance is known to degrade drastically when facing text that is full of typos, grammatical errors and other varieties of noise. In this work, we propose a multi-task learning algorithm for transformer-based MT systems that is more resilient to this noise. We describe our submission to the WMT 2019 Robustness shared task based on this method. Our model achieves a BLEU score of 32.8 on the shared task French-to-English dataset, which is 7.1 BLEU points higher than the baseline vanilla transformer trained with clean text.
Co-authors
- Antonios Anastasopoulos 1
- Rajen Chatterjee 1
- Sarthak Garg 1
- Udhyakumar Nallasamy 1
- Graham Neubig 1