Document-level Translation with LLM Reranking: Team-J at WMT 2024 General Translation Task

Keito Kudo; Hiroyuki Deguchi; Makoto Morishita; Ryo Fujii; Takumi Ito; Shintaro Ozaki; Koki Natsumi; Kai Sato; Kazuki Yano; Ryosuke Takahashi; Subaru Kimura; Tomomasa Hara; Yusuke Sakai; Jun Suzuki

doi:10.18653/v1/2024.wmt-1.14

Abstract

We participated in the constrained track for English-Japanese and Japanese-Chinese translation at the WMT 2024 General Machine Translation Task. Our approach was to generate a large number of sentence-level translation candidates and select the most probable translation using minimum Bayes risk (MBR) decoding and document-level large language model (LLM) re-ranking. We first generated hundreds of translation candidates from multiple translation models and retained the top 30 candidates using MBR decoding. In addition, we continually pre-trained LLMs on target-language corpora to leverage document-level information. We then used these LLMs to select the most probable candidate for each sentence in context, proceeding sequentially from the beginning of the document.
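The two-stage selection described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract does not say which utility function drives MBR decoding or how the LLM scores a candidate, so the sketch assumes a toy character-level F1 as the utility and takes the context scorer as a caller-supplied function. All names here (`overlap_f1`, `mbr_top_k`, `rerank_document`, `context_score`) are hypothetical.

```python
from collections import Counter
from typing import Callable, List, Sequence


def overlap_f1(hyp: str, ref: str) -> float:
    """Toy utility (assumption): character-level unigram F1.

    A stand-in for whatever translation metric a real system would use.
    """
    h, r = Counter(hyp), Counter(ref)
    common = sum((h & r).values())  # multiset intersection of characters
    if common == 0:
        return 0.0
    precision = common / sum(h.values())
    recall = common / sum(r.values())
    return 2 * precision * recall / (precision + recall)


def mbr_top_k(
    candidates: Sequence[str],
    k: int,
    utility: Callable[[str, str], float] = overlap_f1,
) -> List[str]:
    """Keep the k candidates with the highest expected utility.

    Each candidate is scored against all other candidates, which act as
    pseudo-references (the usual sampling-based approximation of MBR).
    """
    scored = []
    for i, hyp in enumerate(candidates):
        others = [c for j, c in enumerate(candidates) if j != i]
        expected = sum(utility(hyp, ref) for ref in others) / max(len(others), 1)
        scored.append((expected, hyp))
    # Stable sort: candidates with equal scores keep their input order.
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return [hyp for _, hyp in scored[:k]]


def rerank_document(
    per_sentence_candidates: Sequence[Sequence[str]],
    context_score: Callable[[List[str], str], float],
) -> List[str]:
    """Greedy left-to-right selection over a document.

    For each source sentence, pick the candidate that `context_score`
    rates highest given the translations already chosen.
    """
    chosen: List[str] = []
    for candidates in per_sentence_candidates:
        best = max(candidates, key=lambda cand: context_score(chosen, cand))
        chosen.append(best)
    return chosen
```

In a setup like the one the abstract describes, `mbr_top_k` would be called per sentence with `k=30`, and `context_score` might be a log-likelihood from the continually pre-trained LLM given the previously selected sentences. That wiring is an assumption here, not a detail stated on this page.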

Anthology ID: 2024.wmt-1.14
Volume: Proceedings of the Ninth Conference on Machine Translation
Month: November
Year: 2024
Address: Miami, Florida, USA
Editors: Barry Haddow, Tom Kocmi, Philipp Koehn, Christof Monz
Venues: WMT | WS
Publisher: Association for Computational Linguistics
Pages: 210–226
URL: https://aclanthology.org/2024.wmt-1.14/
DOI: 10.18653/v1/2024.wmt-1.14
Cite (ACL): Keito Kudo, Hiroyuki Deguchi, Makoto Morishita, Ryo Fujii, Takumi Ito, Shintaro Ozaki, Koki Natsumi, Kai Sato, Kazuki Yano, Ryosuke Takahashi, Subaru Kimura, Tomomasa Hara, Yusuke Sakai, and Jun Suzuki. 2024. Document-level Translation with LLM Reranking: Team-J at WMT 2024 General Translation Task. In Proceedings of the Ninth Conference on Machine Translation, pages 210–226, Miami, Florida, USA. Association for Computational Linguistics.
Cite (Informal): Document-level Translation with LLM Reranking: Team-J at WMT 2024 General Translation Task (Kudo et al., WMT 2024)
PDF: https://aclanthology.org/2024.wmt-1.14.pdf
