Kenji Yamada


2006

Unsupervised Analysis for Decipherment Problems
Kevin Knight | Anish Nair | Nishit Rathod | Kenji Yamada
Proceedings of the COLING/ACL 2006 Main Conference Poster Sessions

2005

Une approche à la traduction automatique statistique par segments discontinus
Michel Simard | Nicola Cancedda | Bruno Cavestro | Marc Dymetman | Eric Gaussier | Cyril Goutte | Philippe Langlais | Arne Mauser | Kenji Yamada
Actes de la 12ème conférence sur le Traitement Automatique des Langues Naturelles. Articles longs

This article presents a statistical machine translation method based on non-contiguous phrases, that is, phrases made up of words that do not necessarily appear contiguously in the text. We propose a method for producing such phrases from word-aligned corpora. We also present a statistical translation model capable of handling such phrases, as well as a parameter training method that aims to maximize the accuracy of the translations produced, as measured by the NIST metric. Optimal translations are produced by means of a beam search. Finally, we present experimental results that show how the proposed method allows better generalization from the training data.

Translating with Non-contiguous Phrases
Michel Simard | Nicola Cancedda | Bruno Cavestro | Marc Dymetman | Eric Gaussier | Cyril Goutte | Kenji Yamada | Philippe Langlais | Arne Mauser
Proceedings of Human Language Technology Conference and Conference on Empirical Methods in Natural Language Processing

2004

Aligning words using matrix factorisation
Cyril Goutte | Kenji Yamada | Eric Gaussier
Proceedings of the 42nd Annual Meeting of the Association for Computational Linguistics (ACL-04)

A Smorgasbord of Features for Statistical Machine Translation
Franz Josef Och | Daniel Gildea | Sanjeev Khudanpur | Anoop Sarkar | Kenji Yamada | Alex Fraser | Shankar Kumar | Libin Shen | David Smith | Katherine Eng | Viren Jain | Zhen Jin | Dragomir Radev
Proceedings of the Human Language Technology Conference of the North American Chapter of the Association for Computational Linguistics: HLT-NAACL 2004

2003

Towards Interactive Text Understanding
Marc Dymetman | Aurélien Max | Kenji Yamada
The Companion Volume to the Proceedings of 41st Annual Meeting of the Association for Computational Linguistics

Reducing Parameter Space for Word Alignment
Herve Dejean | Eric Gaussier | Cyril Goutte | Kenji Yamada
Proceedings of the HLT-NAACL 2003 Workshop on Building and Using Parallel Texts: Data Driven Machine Translation and Beyond

Syntax-based language models for statistical machine translation
Eugene Charniak | Kevin Knight | Kenji Yamada
Proceedings of Machine Translation Summit IX: Papers

We present a syntax-based language model for use in noisy-channel machine translation. In particular, a language model based upon that described in (Cha01) is combined with the syntax-based translation model described in (YK01). The resulting system was used to translate 347 sentences from Chinese to English and compared with the results of an IBM-model-4-based system, as well as that of (YK02), all trained on the same data. The translations were sorted into four groups: good/bad syntax crossed with good/bad meaning. While the total number of translations that preserved meaning was the same for (YK02) and the syntax-based system (and both higher than the IBM-model-4-based system), the syntax-based system had 45% more translations that also had good syntax than did (YK02) (and approximately 70% more than IBM Model 4). The number of translations that did not preserve meaning, but at least had good grammar, also increased, though to less avail.

Improving translation models by applying asymmetric learning
Setsuo Yamada | Masaaki Nagata | Kenji Yamada
Proceedings of Machine Translation Summit IX: Papers

The statistical machine translation model has two components: a language model and a translation model. This paper describes how to improve the quality of the translation model by using the common word pairs extracted by two asymmetric learning approaches. One set of word pairs is extracted by Viterbi alignment using a translation model; the other set is extracted by Viterbi alignment using another translation model created by reversing the languages. The common word pairs are those that appear in both sets. We conducted experiments using English and Japanese. Our method improves the quality of an original translation model by 5.7%. The experiments also show that the proposed learning method improves the word alignment quality independent of the training domain and the translation model. Moreover, we show that common word pairs are almost as useful as regular dictionary entries for training purposes.

2002

The Importance of Lexicalized Syntax Models for Natural Language Generation Tasks
Hal Daume III | Kevin Knight | Irene Langkilde-Geary | Daniel Marcu | Kenji Yamada
Proceedings of the International Natural Language Generation Conference

A Decoder for Syntax-based Statistical MT
Kenji Yamada | Kevin Knight
Proceedings of the 40th Annual Meeting of the Association for Computational Linguistics

2001

Fast Decoding and Optimal Decoding for Machine Translation
Ulrich Germann | Michael Jahr | Kevin Knight | Daniel Marcu | Kenji Yamada
Proceedings of the 39th Annual Meeting of the Association for Computational Linguistics

A Syntax-based Statistical Translation Model
Kenji Yamada | Kevin Knight
Proceedings of the 39th Annual Meeting of the Association for Computational Linguistics

1999

A Computational Approach to Deciphering Unknown Scripts
Kevin Knight | Kenji Yamada
Unsupervised Learning in Natural Language Processing

1996

A controlled skip parser
Kenji Yamada
Conference of the Association for Machine Translation in the Americas

JAPANGLOSS: using statistics to fill knowledge gaps
Kevin Knight | Yaser Al-Onaizan | Ishwar Chander | Eduard Hovy | Irene Langkilde | Richard Whitney | Kenji Yamada
Conference of the Association for Machine Translation in the Americas

1994

Integrating Knowledge Bases and Statistics in MT
Kevin Knight | Ishwar Chander | Matthew Haines | Vasileios Hatzivassiloglou | Eduard Hovy | Masayo Iida | Steve K. Luk | Akitoshi Okumura | Richard Whitney | Kenji Yamada
Proceedings of the First Conference of the Association for Machine Translation in the Americas