Kazuki Yano
2024
Document-level Translation with LLM Reranking: Team-J at WMT 2024 General Translation Task
Keito Kudo | Hiroyuki Deguchi | Makoto Morishita | Ryo Fujii | Takumi Ito | Shintaro Ozaki | Koki Natsumi | Kai Sato | Kazuki Yano | Ryosuke Takahashi | Subaru Kimura | Tomomasa Hara | Yusuke Sakai | Jun Suzuki
Proceedings of the Ninth Conference on Machine Translation
We participated in the constrained track for English-Japanese and Japanese-Chinese translations at the WMT 2024 General Machine Translation Task. Our approach was to generate a large number of sentence-level translation candidates and select the most probable translation using minimum Bayes risk (MBR) decoding and document-level large language model (LLM) re-ranking. We first generated hundreds of translation candidates from multiple translation models and retained the top 30 candidates using MBR decoding. In addition, we continually pre-trained LLMs on the target language corpora to leverage document-level information. We utilized LLMs to select the most probable sentence sequentially in context from the beginning of the document.
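The two selection stages described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation. The abstract does not name the utility metric used for MBR decoding, so a unigram-overlap F1 (`unigram_f1`) stands in for it here. The LLM scorer is replaced by a hypothetical callable `score(context, candidate)`, and `toy_score` is a placeholder that only counts tokens shared with the context. The function names `mbr_top_k` and `rerank_document` are likewise assumptions made for illustration.

```python
from collections import Counter
from typing import Callable


def unigram_f1(hyp: str, ref: str) -> float:
    """Stand-in utility (assumption): unigram-overlap F1 on whitespace tokens."""
    h, r = Counter(hyp.split()), Counter(ref.split())
    overlap = sum((h & r).values())  # multiset intersection
    if overlap == 0:
        return 0.0
    precision = overlap / sum(h.values())
    recall = overlap / sum(r.values())
    return 2 * precision * recall / (precision + recall)


def mbr_top_k(candidates: list[str], k: int,
              utility: Callable[[str, str], float] = unigram_f1) -> list[str]:
    """Keep the k candidates with the highest expected utility,
    treating the other candidates as pseudo-references."""
    scored = []
    for i, hyp in enumerate(candidates):
        others = [c for j, c in enumerate(candidates) if j != i]
        expected = sum(utility(hyp, ref) for ref in others) / max(len(others), 1)
        scored.append((expected, i))
    # Highest expected utility first; ties keep the original order.
    scored.sort(key=lambda t: (-t[0], t[1]))
    return [candidates[i] for _, i in scored[:k]]


def rerank_document(candidates_per_sentence: list[list[str]],
                    score: Callable[[str, str], float]) -> list[str]:
    """Pick one candidate per sentence from the start of the document,
    scoring each candidate against the sentences already selected."""
    selected: list[str] = []
    for candidates in candidates_per_sentence:
        context = " ".join(selected)
        selected.append(max(candidates, key=lambda c: score(context, c)))
    return selected


def toy_score(context: str, candidate: str) -> float:
    """Hypothetical placeholder for an LLM score: tokens shared with the context."""
    return float(len(set(context.split()) & set(candidate.split())))


sentence_candidates = [
    "the cat sat on the mat",
    "the cat sat on a mat",
    "a cat is sitting on the mat",
    "dogs bark loudly",
]
top_two = mbr_top_k(sentence_candidates, k=2)

document_candidates = [
    ["Taro bought a book .", "Taro purchased a novel ."],
    ["He read the book .", "He read the novel ."],
]
document = rerank_document(document_candidates, toy_score)
```

In the pipeline described above, `k` would be 30 and the pool would hold hundreds of candidates from multiple translation models. The scorer would presumably be derived from the continually pre-trained LLM, but the abstract gives no further detail on how candidates are scored.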
Co-authors
- Hiroyuki Deguchi 1
- Ryo Fujii 1
- Tomomasa Hara 1
- Takumi Ito 1
- Subaru Kimura 1
Venues
- wmt1