Scaling Law for Document Neural Machine Translation

Zhang Zhuocheng, Shuhao Gu, Min Zhang, Yang Feng


Abstract
The scaling laws of language models have played a significant role in advancing large language models. To promote the development of document translation, we systematically examine the scaling laws in this field. In this paper, we carry out an in-depth analysis of the influence of three factors on translation quality: model scale, data scale, and sequence length. Our findings reveal that increasing sequence length effectively enhances model performance when model size is limited. However, sequence length cannot be extended indefinitely; it must be suitably matched to the model scale and corpus volume. Further analysis shows that providing adequate context can effectively improve the translation quality of the initial portion of a document. Nonetheless, exposure bias remains the primary factor hindering further improvement in translation quality for the latter half of the document.
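To make the notion of a scaling law concrete, the sketch below fits a generic power law L(N) = a·N^(-b) + c relating a quality metric L to a scale factor N (model parameters, data size, or sequence length). This is a minimal illustration, not the authors' method; the function name power_law and the data points are synthetic assumptions for demonstration only.

```python
import numpy as np
from scipy.optimize import curve_fit

def power_law(n, a, b, c):
    # Generic scaling-law form: loss decays as a power of scale,
    # approaching an irreducible floor c as n grows.
    return a * np.power(n, -b) + c

# Synthetic (scale, loss) observations for demonstration only;
# not taken from the paper.
scales = np.array([1e6, 4e6, 1.6e7, 6.4e7, 2.56e8])
losses = np.array([4.2, 3.6, 3.1, 2.8, 2.6])

# Fit the three parameters from the observations.
params, _ = curve_fit(power_law, scales, losses, p0=[10.0, 0.1, 2.0])
a, b, c = params
print(f"Fitted scaling law: L(N) = {a:.3f} * N^(-{b:.3f}) + {c:.3f}")
```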
Anthology ID:
2023.findings-emnlp.556
Volume:
Findings of the Association for Computational Linguistics: EMNLP 2023
Month:
December
Year:
2023
Address:
Singapore
Editors:
Houda Bouamor, Juan Pino, Kalika Bali
Venue:
Findings
Publisher:
Association for Computational Linguistics
Pages:
8290–8303
URL:
https://aclanthology.org/2023.findings-emnlp.556
DOI:
10.18653/v1/2023.findings-emnlp.556
Cite (ACL):
Zhang Zhuocheng, Shuhao Gu, Min Zhang, and Yang Feng. 2023. Scaling Law for Document Neural Machine Translation. In Findings of the Association for Computational Linguistics: EMNLP 2023, pages 8290–8303, Singapore. Association for Computational Linguistics.
Cite (Informal):
Scaling Law for Document Neural Machine Translation (Zhuocheng et al., Findings 2023)
PDF:
https://aclanthology.org/2023.findings-emnlp.556.pdf