Anqi Zhao
2024
Context-Aware Non-Autoregressive Document-Level Translation with Sentence-Aligned Connectionist Temporal Classification
Hao Yu
|
Kaiyu Huang
|
Anqi Zhao
|
Junpeng Liu
|
Degen Huang
Proceedings of the 2024 Joint International Conference on Computational Linguistics, Language Resources and Evaluation (LREC-COLING 2024)
Previous studies employ the autoregressive translation (AT) paradigm in the document-to-document neural machine translation. These methods extend the translation unit from a single sentence to a pseudo-document and encodes the full pseudo-document, avoiding the redundant computation problem in context. However, the AT methods cannot parallelize decoding and struggle with error accumulation, especially when the length of sentences increases. In this work, we propose a context-aware non-autoregressive framework with the sentence-aligned connectionist temporal classification (SA-CTC) loss for document-level neural machine translation. In particular, the SA-CTC loss reduces the search space of the decoding path by fixing the positions of the beginning and end tokens for each sentence in the document. Meanwhile, the context-aware architecture introduces preset nodes to represent sentence-level information and utilizes a hierarchical attention structure to regulate the attention hypothesis space. Experimental results show that our proposed method can achieve competitive performance compared with several strong baselines. Our method implements non-autoregressive modeling in Doc-to-Doc translation manner, achieving an average 46X decoding speedup compared to the document-level AT baselines on three benchmarks.
2023
DUTNLP System for the WMT2023 Discourse-Level Literary Translation
Anqi Zhao
|
Kaiyu Huang
|
Hao Yu
|
Degen Huang
Proceedings of the Eighth Conference on Machine Translation
This paper describes the submission of DUTNLP Lab submission to WMT23 Discourse-Level Literary Translation in the Chinese to English translation direction under unconstrained conditions. Our primary system aims to leverage a large language model with various prompt strategies, which can fully investigate the potential capabilities of large language models for discourse-level neural machine translation. Moreover, we test a widely used discourse-level machine translation model, G-transformer, with different training strategies. In our experimental results, the method with large language models achieves a BLEU score of 28.16, while the fine-tuned method scores 25.26. These findings indicate that selecting appropriate prompt strategies based on large language models can significantly improve translation performance compared to traditional model training methods.