Machine Translation Marathon 2012

September 3-8, Edinburgh, UK


Program

Talks and lectures

Monday, 3rd September

Lecture: Introduction to Statistical Machine Translation
Chris Dyer

TrTok: A Fast and Trainable Tokenizer for Natural Languages
Jiří Maršík and Ondřej Bojar

High-Precision Sentence Alignment by Bootstrapping from Wood Standard Annotations [not available]
Eva Mújdricza-Maydt, Huiqin Körkel-Qu, Stefan Riezler and Sebastian Padó

Invited Talk: Discourse and SMT: Where and How?
Bonnie Webber

Labs: Moses and the Experiment Management System [http://www.statmt.org/mtm12/index.php%3Fn=Main.MosesLab]
Barry Haddow and Philipp Koehn

 

Tuesday, 4th September

Lecture: Word-based Models
Colin Cherry

Lecture: Phrase-based Models
Hieu Hoang

Phrasal Rank-Encoding: Exploiting Phrase Redundancy and Translational Relations for Phrase Table Compression
Marcin Junczys-Dowmunt

Parallel Phrase Scoring for Extra-large Corpora
Mohammed Mediani, Jan Niehues and Alex Waibel

Invited Talk: Large Scale Parallel Data-mining for Google Translate [not available]
Arne Mauser

Labs: Alignment [not available]
Marcello Federico, Colin Cherry and Dave Matthews

 

Wednesday, 5th September

Lecture: Decoding for Phrase-based Models
Colin Cherry

Lecture: Language Modelling
Marcello Federico

pycdec: A Python Interface to cdec
Victor Chahuneau, Noah A. Smith and Chris Dyer

Better Splitting Algorithms for Parallel Corpus Processing
Lane Schwartz

Invited Talk: Translation Process Research and the CRITT TPR Database
Michael Carl

 

Thursday, 6th September

Lecture: Hierarchical and Syntactic Models
Phil Blunsom

Lecture: Chart-based Decoding
Kenneth Heafield

Hierarchical Phrase-Based Translation with Jane 2
Matthias Huck, Jan-Thorsten Peter, Markus Freitag, Stephan Peitz and Hermann Ney [presentation]

Extending Hiero Decoding in Moses with Cube Growing
Wenduan Xu and Philipp Koehn

Invited Talk: MT R&D in Academia and Industry: Observations from the Trenches
Andy Way

 

Labs: Decoding [not available]

 

Discussion: The Future of Open-Source Machine Translation [not available]
Marcello Federico

 

Friday, 7th September

Lecture: Discriminative Training
Chris Dyer

Lecture: Computer Aided Translation
Philipp Koehn

Appraise: an Open-Source Toolkit for Manual Evaluation of MT Output
Christian Federmann [presentation]

DELiC4MT: A Tool for Diagnostic MT Evaluation over User-defined Linguistic Phenomena
Antonio Toral, Sudip Kumar Naskar, Federico Gaspari and Declan Groves

Invited Talk: Quality Estimation for MT: State of the Art and Challenges
Lucia Specia

Projects

Diagnostic evaluation of MT with DELiC4MT

Walid Aransa, Luong Ngoc Quang, & Antonio Toral

 

Document-level decoding in Moses

Nicola Bertoldi, Robert Grabowski, Liane Guillou, Michal Novak, Sorin Slavescu, Jose de Souza

 

Multiple reference translations for European languages

Christian Buck, Daniel Zeman, Eva Hasler

 

Sparse features in Moses

Colin Cherry, Barry Haddow

 

Building Moses training pipelines with Arrows

Jie Jiang, David Kolovratnik, Ian Johnson

 

Sparse features in Joshua

Matt Post, Juri Ganitkevitch

 

Bounded-memory language model building

Ivan Pouzyrevsky, Mohammed Mediani, Kenneth Heafield

 

Parallel corpus extraction from CommonCrawl

Hervé Saint-Amand, Jason Smith, Magdalena Plamada

 

New development functionality for the Asiya Suite: parameter optimization with MERT

Meritxell Gonzàlez, Cristina España-Bonet