Tetsuji Nakagawa

2025

In this paper, we present our submissions to the unified WMT25 Translation Evaluation Shared Task. For the Quality Score Prediction subtask, we create a new generation of MetricX with improvements in the input format and the training protocol, while for the Error Span Detection subtask we develop a new model, GemSpanEval, trained to predict error spans along with their severities and categories. Both systems are based on the state-of-the-art multilingual open-weights model Gemma 3, fine-tuned on publicly available WMT data. We demonstrate that MetricX-25, adapting Gemma 3 to an encoder-only architecture with a regression head on top, can be trained to effectively predict both MQM and ESA quality scores, and significantly outperforms its predecessor. Our decoder-only GemSpanEval model, on the other hand, we show to be competitive in error span detection with xCOMET, a strong encoder-only sequence-tagging baseline. With error span detection formulated as a generative task, we instruct the model to also output the context for each predicted error span, thus ensuring that error spans are identified unambiguously.

2023

pdf bib abs

LEALLA: Learning Lightweight Language-agnostic Sentence Embeddings with Knowledge Distillation
Zhuoyuan Mao | Tetsuji Nakagawa
Proceedings of the 17th Conference of the European Chapter of the Association for Computational Linguistics

Large-scale language-agnostic sentence embedding models such as LaBSE (Feng et al., 2022) obtain state-of-the-art performance for parallel sentence alignment. However, these large-scale models can suffer from inference speed and computation overhead. This study systematically explores learning language-agnostic sentence embeddings with lightweight models. We demonstrate that a thin-deep encoder can construct robust low-dimensional sentence embeddings for 109 languages. With our proposed distillation methods, we achieve further improvements by incorporating knowledge from a teacher model. Empirical results on Tatoeba, United Nations, and BUCC show the effectiveness of our lightweight models. We release our lightweight language-agnostic sentence embedding models LEALLA on TensorFlow Hub.

2018

pdf bib abs

Denoising Neural Machine Translation Training with Trusted Data and Online Data Selection
Wei Wang | Taro Watanabe | Macduff Hughes | Tetsuji Nakagawa | Ciprian Chelba
Proceedings of the Third Conference on Machine Translation: Research Papers

Measuring domain relevance of data and identifying or selecting well-fit domain data for machine translation (MT) is a well-studied topic, but denoising is not yet. Denoising is concerned with a different type of data quality and tries to reduce the negative impact of data noise on MT training, in particular, neural MT (NMT) training. This paper generalizes methods for measuring and selecting data for domain MT and applies them to denoising NMT training. The proposed approach uses trusted data and a denoising curriculum realized by online data selection. Intrinsic and extrinsic evaluations of the approach show its significant effectiveness for NMT to train on data with severe noise.

2017

pdf bib

An Empirical Study of Language Relatedness for Transfer Learning in Neural Machine Translation
Raj Dabre | Tetsuji Nakagawa | Hideto Kazawa
Proceedings of the 31st Pacific Asia Conference on Language, Information and Computation

2016

pdf bib abs

Phrase-based Machine Translation using Multiple Preordering Candidates
Yusuke Oda | Taku Kudo | Tetsuji Nakagawa | Taro Watanabe
Proceedings of COLING 2016, the 26th International Conference on Computational Linguistics: Technical Papers

In this paper, we propose a new decoding method for phrase-based statistical machine translation which directly uses multiple preordering candidates as a graph structure. Compared with previous phrase-based decoding methods, our method is based on a simple left-to-right dynamic programming in which no decoding-time reordering is performed. As a result, its runtime is very fast and implementing the algorithm becomes easy. Our system does not depend on specific preordering methods as long as they output multiple preordering candidates, and it is trivial to employ existing preordering methods into our system. In our experiments for translating diverse 11 languages into English, the proposed method outperforms conventional phrase-based decoder in terms of translation qualities under comparable or faster decoding time.

Co-authors

Venues