SCIR-MT’s Submission for WMT24 General Machine Translation Task

Baohang Li, Zekai Ye, Yichong Huang, Xiaocheng Feng, Bing Qin


Abstract
This paper describes the submission of the SCIR research center at Harbin Institute of Technology to the constrained track of the WMT24 machine translation evaluation task for English-to-Czech. Our approach involved rigorous cleaning and deduplication of both monolingual and bilingual data, followed by a three-stage model training recipe. At test time, we used beam search decoding to generate a large number of candidate translations, and then applied COMET-MBR decoding to select the best translation among them.
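The candidate-selection step mentioned in the abstract can be illustrated with a minimal sketch of Minimum Bayes Risk (MBR) selection. This is not the authors' implementation: the function names `token_f1` and `mbr_select` are hypothetical, and a simple unigram F1 overlap stands in for COMET as the utility function, purely to keep the example self-contained. In a COMET-MBR setup the utility would instead be a COMET score computed with the source sentence, the hypothesis, and another candidate acting as a pseudo-reference.

```python
from collections import Counter
from typing import Callable, Sequence


def token_f1(hyp: str, ref: str) -> float:
    """Stand-in utility (assumption): unigram F1 overlap, NOT COMET."""
    h, r = hyp.split(), ref.split()
    if not h or not r:
        return 0.0
    # Multiset intersection counts each shared token at most min(count) times.
    overlap = sum((Counter(h) & Counter(r)).values())
    if overlap == 0:
        return 0.0
    precision, recall = overlap / len(h), overlap / len(r)
    return 2 * precision * recall / (precision + recall)


def mbr_select(
    candidates: Sequence[str],
    utility: Callable[[str, str], float] = token_f1,
) -> str:
    """Return the candidate with the highest average utility against
    all other candidates, which are treated as pseudo-references."""
    if not candidates:
        raise ValueError("candidates must not be empty")
    best_idx, best_score = 0, float("-inf")
    for i, hyp in enumerate(candidates):
        others = [c for j, c in enumerate(candidates) if j != i]
        # With a single candidate there is nothing to compare against.
        score = (
            sum(utility(hyp, ref) for ref in others) / len(others)
            if others
            else 0.0
        )
        if score > best_score:
            best_idx, best_score = i, score
    return candidates[best_idx]


# Toy candidate list, as might come from beam search (illustrative only).
beam_candidates = [
    "the cat sat on the mat",
    "the cat sat on a mat",
    "a dog ran away",
]
selected = mbr_select(beam_candidates)
```

The selected candidate is the one that agrees most with the rest of the candidate pool, which is the core idea of MBR decoding; the cost grows quadratically with the number of candidates because every pair is scored.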
Anthology ID:
2024.wmt-1.21
Volume:
Proceedings of the Ninth Conference on Machine Translation
Month:
November
Year:
2024
Address:
Miami, Florida, USA
Editors:
Barry Haddow, Tom Kocmi, Philipp Koehn, Christof Monz
Venue:
WMT
Publisher:
Association for Computational Linguistics
Pages:
280–285
URL:
https://aclanthology.org/2024.wmt-1.21
Cite (ACL):
Baohang Li, Zekai Ye, Yichong Huang, Xiaocheng Feng, and Bing Qin. 2024. SCIR-MT’s Submission for WMT24 General Machine Translation Task. In Proceedings of the Ninth Conference on Machine Translation, pages 280–285, Miami, Florida, USA. Association for Computational Linguistics.
Cite (Informal):
SCIR-MT’s Submission for WMT24 General Machine Translation Task (Li et al., WMT 2024)
PDF:
https://aclanthology.org/2024.wmt-1.21.pdf