Andreas Zollmann


2011

A Word-Class Approach to Labeling PSCFG Rules for Machine Translation
Andreas Zollmann | Stephan Vogel
Proceedings of the 49th Annual Meeting of the Association for Computational Linguistics: Human Language Technologies

2010

New Parameterizations and Features for PSCFG-Based Machine Translation
Andreas Zollmann | Stephan Vogel
Proceedings of the 4th Workshop on Syntax and Structure in Statistical Translation

2009

Preference Grammars: Softening Syntactic Constraints to Improve Statistical Machine Translation
Ashish Venugopal | Andreas Zollmann | Noah A. Smith | Stephan Vogel
Proceedings of Human Language Technologies: The 2009 Annual Conference of the North American Chapter of the Association for Computational Linguistics

2008

A Systematic Comparison of Phrase-Based, Hierarchical and Syntax-Augmented Statistical MT
Andreas Zollmann | Ashish Venugopal | Franz Och | Jay Ponte
Proceedings of the 22nd International Conference on Computational Linguistics (Coling 2008)

The CMU syntax-augmented machine translation system: SAMT on Hadoop with n-best alignments.
Andreas Zollmann | Ashish Venugopal | Stephan Vogel
Proceedings of the 5th International Workshop on Spoken Language Translation: Evaluation Campaign

We present the CMU Syntax Augmented Machine Translation System that was used in the IWSLT-08 evaluation campaign. We participated in the Full-BTEC data track for Chinese-English translation, focusing on transcript translation. For this year’s evaluation, we ported the Syntax Augmented MT toolkit [1] to the Hadoop MapReduce [2] parallel processing architecture, allowing us to efficiently run experiments evaluating a novel “wider pipelines” approach to integrate evidence from N-best alignments into our translation models. We describe each step of the MapReduce pipeline as it is implemented in the open-source SAMT toolkit, and show improvements in translation quality by using N-best alignments in both hierarchical and syntax augmented translation systems.

Wider Pipelines: N-Best Alignments and Parses in MT Training
Ashish Venugopal | Andreas Zollmann | Noah A. Smith | Stephan Vogel
Proceedings of the 8th Conference of the Association for Machine Translation in the Americas: Research Papers

State-of-the-art statistical machine translation systems use hypotheses from several maximum a posteriori inference steps, including word alignments and parse trees, to identify translational structure and estimate the parameters of translation models. While this approach leads to a modular pipeline of independently developed components, errors made in these “single-best” hypotheses can propagate to downstream estimation steps that treat these inputs as clean, trustworthy training data. In this work we integrate N-best alignments and parses by using a probability distribution over these alternatives to generate posterior fractional counts for use in downstream estimation. Using these fractional counts in a DOP-inspired syntax-based translation system, we show significant improvements in translation quality over a single-best trained baseline.

2007

An Efficient Two-Pass Approach to Synchronous-CFG Driven Statistical MT
Ashish Venugopal | Andreas Zollmann | Stephan Vogel
Human Language Technologies 2007: The Conference of the North American Chapter of the Association for Computational Linguistics; Proceedings of the Main Conference

The Syntax Augmented MT (SAMT) System at the Shared Task for the 2007 ACL Workshop on Statistical Machine Translation
Andreas Zollmann | Ashish Venugopal | Matthias Paulik | Stephan Vogel
Proceedings of the Second Workshop on Statistical Machine Translation

The CMU-UKA statistical machine translation systems for IWSLT 2007
Ian Lane | Andreas Zollmann | Thuy Linh Nguyen | Nguyen Bach | Ashish Venugopal | Stephan Vogel | Kay Rottmann | Ying Zhang | Alex Waibel
Proceedings of the Fourth International Workshop on Spoken Language Translation

This paper describes the CMU-UKA statistical machine translation systems submitted to the IWSLT 2007 evaluation campaign. Systems were submitted for three language pairs: Japanese→English, Chinese→English and Arabic→English. All systems were based on a common phrase-based SMT (statistical machine translation) framework, but for each language pair a specific research problem was tackled. For Japanese→English we focused on two problems: first, punctuation recovery, and second, how to incorporate topic knowledge into the translation framework. Our Chinese→English submission focused on syntax-augmented SMT, and for the Arabic→English task we focused on incorporating morphological decomposition into the SMT framework. This research strategy enabled us to evaluate a wide variety of approaches, which proved effective for the language pairs they were evaluated on.

2006

The CMU-UKA syntax augmented machine translation system for IWSLT-06
Andreas Zollmann | Ashish Venugopal | Stephan Vogel | Alex Waibel
Proceedings of the Third International Workshop on Spoken Language Translation: Evaluation Campaign

Bridging the Inflection Morphology Gap for Arabic Statistical Machine Translation
Andreas Zollmann | Ashish Venugopal | Stephan Vogel
Proceedings of the Human Language Technology Conference of the NAACL, Companion Volume: Short Papers

Syntax Augmented Machine Translation via Chart Parsing
Andreas Zollmann | Ashish Venugopal
Proceedings on the Workshop on Statistical Machine Translation

2005

Training and Evaluating Error Minimization Decision Rules for Statistical Machine Translation
Ashish Venugopal | Andreas Zollmann | Alex Waibel
Proceedings of the ACL Workshop on Building and Using Parallel Texts