Kenji Yamada

2006

Unsupervised Analysis for Decipherment Problems
Kevin Knight | Anish Nair | Nishit Rathod | Kenji Yamada
Proceedings of the COLING/ACL 2006 Main Conference Poster Sessions

2005

This article presents a statistical machine translation method based on non-contiguous segments, that is, segments formed of words that do not necessarily appear contiguously in the text. We propose a method for producing such segments from word-aligned corpora. We also present a statistical translation model capable of taking such segments into account, along with a method for training the model's parameters that aims to maximize the accuracy of the translations produced, as measured by the NIST metric. Optimal translations are produced by means of a beam search. Finally, we present experimental results showing how the proposed method allows better generalization from the training data.

2004

Aligning words using matrix factorisation
Cyril Goutte | Kenji Yamada | Eric Gaussier
Proceedings of the 42nd Annual Meeting of the Association for Computational Linguistics (ACL-04)

2003

Syntax-based language models for statistical machine translation
Eugene Charniak | Kevin Knight | Kenji Yamada
Proceedings of Machine Translation Summit IX: Papers

We present a syntax-based language model for use in noisy-channel machine translation. In particular, a language model based upon that described in (Cha01) is combined with the syntax-based translation model described in (YK01). The resulting system was used to translate 347 sentences from Chinese to English and compared with the results of an IBM-model-4-based system, as well as that of (YK02), all trained on the same data. The translations were sorted into four groups: good/bad syntax crossed with good/bad meaning. While the total number of translations that preserved meaning was the same for (YK02) and the syntax-based system (and both higher than the IBM-model-4-based system), the syntax-based system had 45% more translations that also had good syntax than did (YK02) (and approximately 70% more than IBM Model 4). The number of translations that did not preserve meaning, but at least had good grammar, also increased, though to less avail.

Improving translation models by applying asymmetric learning
Setsuo Yamada | Masaaki Nagata | Kenji Yamada
Proceedings of Machine Translation Summit IX: Papers

The statistical machine translation model has two components: a language model and a translation model. This paper describes how to improve the quality of the translation model by using the common word pairs extracted by two asymmetric learning approaches. One set of word pairs is extracted by Viterbi alignment using a translation model; the other set is extracted by Viterbi alignment using another translation model created by reversing the languages. The common word pairs are those that appear in both sets. We conducted experiments using English and Japanese. Our method improves the quality of an original translation model by 5.7%. The experiments also show that the proposed learning method improves the word alignment quality independent of the training domain and the translation model. Moreover, we show that common word pairs are almost as useful as regular dictionary entries for training purposes.

Towards Interactive Text Understanding
Marc Dymetman | Aurélien Max | Kenji Yamada
The Companion Volume to the Proceedings of 41st Annual Meeting of the Association for Computational Linguistics

Reducing Parameter Space for Word Alignment
Herve Dejean | Eric Gaussier | Cyril Goutte | Kenji Yamada
Proceedings of the HLT-NAACL 2003 Workshop on Building and Using Parallel Texts: Data Driven Machine Translation and Beyond
