Graeme Blackwood

2023

Synthetic Pre-Training Tasks for Neural Machine Translation
Zexue He | Graeme Blackwood | Rameswar Panda | Julian McAuley | Rogerio Feris
Findings of the Association for Computational Linguistics: ACL 2023

Pre-training models with large crawled corpora can lead to issues such as toxicity and bias, as well as copyright and privacy concerns. A promising way of alleviating such concerns is to conduct pre-training with synthetic tasks and data, since no real-world information is ingested by the model. Our goal in this paper is to understand the factors that contribute to the effectiveness of pre-training models when using synthetic resources, particularly in the context of neural machine translation. We propose several novel approaches to pre-training translation models that involve different levels of lexical and structural knowledge, including: 1) generating obfuscated data from a large parallel corpus, 2) concatenating phrase pairs extracted from a small word-aligned corpus, and 3) generating synthetic parallel data without real human language corpora. Our experiments on multiple language pairs reveal that pre-training benefits can be realized even with high levels of obfuscation or purely synthetic parallel data. We hope the findings from our comprehensive empirical analysis will shed light on understanding what matters for NMT pre-training, as well as pave the way for the development of more efficient and less toxic models.

2018

Multilingual Neural Machine Translation with Task-Specific Attention
Graeme Blackwood | Miguel Ballesteros | Todd Ward
Proceedings of the 27th International Conference on Computational Linguistics

Multilingual machine translation addresses the task of translating between multiple source and target languages. We propose task-specific attention models, a simple but effective technique for improving the quality of sequence-to-sequence neural multilingual translation. Our approach seeks to retain as much of the parameter sharing generalization of NMT models as possible, while still allowing for language-specific specialization of the attention model to a particular language-pair or task. Our experiments on four languages of the Europarl corpus show that using a target-specific model of attention provides consistent gains in translation quality for all possible translation directions, compared to a model in which all parameters are shared. We observe improved translation quality even in the (extreme) low-resource zero-shot translation directions for which the model never saw explicitly paired parallel data.

2015

A Coarse-Grained Model for Optimal Coupling of ASR and SMT Systems for Speech Translation
Gaurav Kumar | Graeme Blackwood | Jan Trmal | Daniel Povey | Sanjeev Khudanpur
Proceedings of the 2015 Conference on Empirical Methods in Natural Language Processing

2014

Automatic dialect classification for statistical machine translation
Saab Mansour | Yaser Al-Onaizan | Graeme Blackwood | Christoph Tillmann
Proceedings of the 11th Conference of the Association for Machine Translation in the Americas: MT Researchers Track

The training data for statistical machine translation are gathered from various sources representing a mixture of domains. In this work, we argue that when translating dialects representing varieties of the same language, a manually assigned data source is not a reliable indicator of the dialect. We resort to automatic dialect classification to refine the training corpora according to the different dialects and build improved dialect-specific systems. A fairly standard classifier for Arabic developed within this work achieves state-of-the-art performance, with classification precision above 90%, making it usefully accurate for our application. The classification of the data is then used to distinguish between the different dialects, split the data accordingly, and utilize the new splits for several adaptation techniques. Performing translation experiments on a large-scale dialectal Arabic to English translation task, our results show that the classifier generates better contrast between the dialects and achieves better translation quality than using the original manual corpora splits.

2012

Syntax-Based Word Ordering Incorporating a Large-Scale Language Model
Yue Zhang | Graeme Blackwood | Stephen Clark
Proceedings of the 13th Conference of the European Chapter of the Association for Computational Linguistics

Lattice-Based Minimum Error Rate Training Using Weighted Finite-State Transducers with Tropical Polynomial Weights
Aurelien Waite | Graeme Blackwood | William Byrne
Proceedings of the 10th International Workshop on Finite State Methods and Natural Language Processing

2010

Efficient Path Counting Transducers for Minimum Bayes-Risk Decoding of Statistical Machine Translation Lattices
Graeme Blackwood | Adrià de Gispert | William Byrne
Proceedings of the ACL 2010 Conference Short Papers

Hierarchical Phrase-Based Translation with Weighted Finite-State Transducers and Shallow-n Grammars
Adrià de Gispert | Gonzalo Iglesias | Graeme Blackwood | Eduardo R. Banga | William Byrne
Computational Linguistics, Volume 36, Issue 3 - September 2010

Fluency Constraints for Minimum Bayes-Risk Decoding of Statistical Machine Translation Lattices
Graeme Blackwood | Adrià de Gispert | William Byrne
Proceedings of the 23rd International Conference on Computational Linguistics (Coling 2010)

2008

European Language Translation with Weighted Finite State Transducers: The CUED MT System for the 2008 ACL Workshop on SMT
Graeme Blackwood | Adrià de Gispert | Jamie Brunning | William Byrne
Proceedings of the Third Workshop on Statistical Machine Translation

Phrasal Segmentation Models for Statistical Machine Translation
Graeme Blackwood | Adrià de Gispert | William Byrne
Coling 2008: Companion volume: Posters
