AnchiLm: An Effective Classical-to-Modern Chinese Translation Model Leveraging bpe-drop and SikuRoBERTa

Jiahui Zhu, Sizhou Chen

Abstract
In this paper, we present our submitted model for translating ancient Chinese texts into modern Chinese, which ranked sixth in the closed track for ancient Chinese at the 2nd International Review of Automatic Analysis of Ancient Chinese (EvaHan). Specifically, we employed two strategies to improve ancient-to-modern translation. First, we used bpe-drop (BPE-dropout) to augment the parallel corpus. Second, we used SikuRoBERTa both to initialize the translation model's encoder and decoder and to reconstruct the BPE vocabulary. In our experiments, we compared the baseline model, R-Drop, a pre-trained model, and parameter initialization methods. The experimental results show that the parameter initialization method in this paper significantly outperforms the baseline model, reaching a BLEU score of 21.75.
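For illustration, the two strategies in the abstract can be sketched in code. Below is a minimal sketch of BPE-dropout-style corpus augmentation using SentencePiece's sampling API; the paper does not specify its tooling, so the model path, file names, and dropout rate here are hypothetical:

    # Sketch: BPE-dropout augmentation of the ancient-Chinese source side.
    # Assumes a BPE-type SentencePiece model already trained on the corpus;
    # "anchi_bpe.model" and alpha=0.1 are illustrative choices, not the paper's.
    import sentencepiece as spm

    sp = spm.SentencePieceProcessor(model_file="anchi_bpe.model")

    def bpe_drop(line: str, alpha: float = 0.1) -> str:
        # For a BPE model, enable_sampling=True drops each merge with
        # probability alpha, so repeated calls segment the same sentence
        # differently -- a cheap data-augmentation signal for NMT.
        pieces = sp.encode(line, out_type=str, enable_sampling=True, alpha=alpha)
        return " ".join(pieces)

    with open("train.anc", encoding="utf-8") as src, \
         open("train.anc.drop", "w", encoding="utf-8") as out:
        for line in src:
            out.write(bpe_drop(line.strip()) + "\n")

The second strategy, warm-starting the translation model from SikuRoBERTa, can likewise be sketched with HuggingFace Transformers, assuming the public SIKU-BERT/sikuroberta checkpoint; the paper's exact initialization and vocabulary-reconstruction steps may differ:

    # Sketch: initialize both encoder and decoder from SikuRoBERTa.
    from transformers import BertTokenizer, EncoderDecoderModel

    name = "SIKU-BERT/sikuroberta"  # assumed checkpoint name
    tokenizer = BertTokenizer.from_pretrained(name)

    # The decoder copy gains randomly initialized cross-attention layers;
    # all other weights start from the pre-trained checkpoint.
    model = EncoderDecoderModel.from_encoder_decoder_pretrained(name, name)
    model.config.decoder_start_token_id = tokenizer.cls_token_id
    model.config.pad_token_id = tokenizer.pad_token_id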
Anthology ID: 2023.alt-1.8
Volume: Proceedings of ALT2023: Ancient Language Translation Workshop
Month: September
Year: 2023
Address: Macau SAR, China
Venue: alt
Publisher: Asia-Pacific Association for Machine Translation
Pages: 55–60
URL: https://aclanthology.org/2023.alt-1.8
Cite (ACL): Jiahui Zhu and Sizhou Chen. 2023. AnchiLm: An Effective Classical-to-Modern Chinese Translation Model Leveraging bpe-drop and SikuRoBERTa. In Proceedings of ALT2023: Ancient Language Translation Workshop, pages 55–60, Macau SAR, China. Asia-Pacific Association for Machine Translation.
Cite (Informal): AnchiLm: An Effective Classical-to-Modern Chinese Translation Model Leveraging bpe-drop and SikuRoBERTa (Zhu & Chen, alt 2023)
PDF: https://aclanthology.org/2023.alt-1.8.pdf