David Talbot


2019

Analyzing Multi-Head Self-Attention: Specialized Heads Do the Heavy Lifting, the Rest Can Be Pruned
Elena Voita | David Talbot | Fedor Moiseev | Rico Sennrich | Ivan Titov
Proceedings of the 57th Annual Meeting of the Association for Computational Linguistics

Multi-head self-attention is a key component of the Transformer, a state-of-the-art architecture for neural machine translation. In this work we evaluate the contribution made by individual attention heads to the overall performance of the model and analyze the roles played by them in the encoder. We find that the most important and confident heads play consistent and often linguistically-interpretable roles. When pruning heads using a method based on stochastic gates and a differentiable relaxation of the L0 penalty, we observe that specialized heads are last to be pruned. Our novel pruning method removes the vast majority of heads without seriously affecting performance. For example, on the English-Russian WMT dataset, pruning 38 out of 48 encoder heads results in a drop of only 0.15 BLEU.

2011

Language-independent compound splitting with morphological operations
Klaus Macherey | Andrew Dai | David Talbot | Ashok Popat | Franz Och
Proceedings of the 49th Annual Meeting of the Association for Computational Linguistics: Human Language Technologies

A Lightweight Evaluation Framework for Machine Translation Reordering
David Talbot | Hideto Kazawa | Hiroshi Ichikawa | Jason Katz-Brown | Masakazu Seno | Franz Och
Proceedings of the Sixth Workshop on Statistical Machine Translation

Training a Parser for Machine Translation Reordering
Jason Katz-Brown | Slav Petrov | Ryan McDonald | Franz Och | David Talbot | Hiroshi Ichikawa | Masakazu Seno | Hideto Kazawa
Proceedings of the 2011 Conference on Empirical Methods in Natural Language Processing

Watermarking the Outputs of Structured Prediction with an application in Statistical Machine Translation
Ashish Venugopal | Jakob Uszkoreit | David Talbot | Franz Och | Juri Ganitkevitch
Proceedings of the 2011 Conference on Empirical Methods in Natural Language Processing

2008

Randomized Language Models via Perfect Hash Functions
David Talbot | Thorsten Brants
Proceedings of ACL-08: HLT

2007

Randomised Language Modelling for Statistical Machine Translation
David Talbot | Miles Osborne
Proceedings of the 45th Annual Meeting of the Association of Computational Linguistics

Smoothed Bloom Filter Language Models: Tera-Scale LMs on the Cheap
David Talbot | Miles Osborne
Proceedings of the 2007 Joint Conference on Empirical Methods in Natural Language Processing and Computational Natural Language Learning (EMNLP-CoNLL)

2006

Modelling Lexical Redundancy for Machine Translation
David Talbot | Miles Osborne
Proceedings of the 21st International Conference on Computational Linguistics and 44th Annual Meeting of the Association for Computational Linguistics

2005

Edinburgh System Description for the 2005 IWSLT Speech Translation Evaluation
Philipp Koehn | Amittai Axelrod | Alexandra Birch Mayne | Chris Callison-Burch | Miles Osborne | David Talbot
Proceedings of the Second International Workshop on Spoken Language Translation

2004

Statistical Machine Translation with Word- and Sentence-Aligned Parallel Corpora
Chris Callison-Burch | David Talbot | Miles Osborne
Proceedings of the 42nd Annual Meeting of the Association for Computational Linguistics (ACL-04)