Jiawei Zheng


2025

SubDocTrans: Enhancing Document-level Machine Translation with Plug-and-play Multi-granularity Knowledge Augmentation
Hanghai Hong | Yibo Xie | Jiawei Zheng | Xiaoli Wang
Findings of the Association for Computational Linguistics: EMNLP 2025

Large language models (LLMs) have recently achieved remarkable progress in sentence-level machine translation, but scaling to document-level machine translation (DocMT) remains challenging, particularly in modeling long-range dependencies and discourse phenomena across sentences and paragraphs. Document translations generated by LLMs often suffer from poor consistency, weak coherence, and omission errors. To address these issues, we propose SubDocTrans, a novel DocMT framework that enables LLMs to produce high-quality translations through plug-and-play, multi-granularity knowledge extraction and integration. SubDocTrans first performs topic segmentation to divide a document into coherent topic sub-documents. For each sub-document, both global and local knowledge are extracted, including a bilingual summary, theme, proper nouns, topics, and transition hints. We then incorporate this multi-granularity knowledge into the prompting strategy to guide LLMs in producing consistent, coherent, and accurate translations. We conduct extensive experiments across various DocMT tasks, and the results demonstrate the effectiveness of our framework, particularly in improving consistency and coherence, reducing omission errors, and mitigating hallucinations.
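
As a rough illustration of the pipeline this abstract describes, the sketch below stubs out the segmentation and knowledge-extraction steps. The function names, the fixed-size chunking heuristic, and the prompt wording are all hypothetical placeholders, not the authors' implementation:

```python
# Minimal sketch of a SubDocTrans-style prompting pipeline (hypothetical;
# real topic segmentation and knowledge extraction are stubbed out).
from dataclasses import dataclass, field
from typing import List


@dataclass
class SubDocKnowledge:
    """Multi-granularity knowledge for one topic sub-document."""
    summary: str = ""                                      # bilingual summary (global)
    theme: str = ""                                        # overall theme (global)
    proper_nouns: List[str] = field(default_factory=list)  # terms to keep consistent
    topics: List[str] = field(default_factory=list)        # local topics
    transition_hint: str = ""                              # link to the previous sub-document


def segment_by_topic(sentences: List[str], chunk: int = 4) -> List[List[str]]:
    """Toy stand-in for topic segmentation: fixed-size chunks."""
    return [sentences[i:i + chunk] for i in range(0, len(sentences), chunk)]


def build_prompt(subdoc: List[str], k: SubDocKnowledge,
                 src_lang: str = "German", tgt_lang: str = "English") -> str:
    """Fold the extracted knowledge into a single translation prompt for an LLM."""
    return (
        f"Translate the following {src_lang} passage into {tgt_lang}.\n"
        f"Theme: {k.theme}\n"
        f"Bilingual summary so far: {k.summary}\n"
        f"Keep these proper nouns consistent: {', '.join(k.proper_nouns)}\n"
        f"Transition from the previous passage: {k.transition_hint}\n\n"
        + " ".join(subdoc)
    )
```

The knowledge object is deliberately plug-and-play: any extractor that fills those fields can feed the same prompt template.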

2023

Length-Aware NMT and Adaptive Duration for Automatic Dubbing
Zhiqiang Rao | Hengchao Shang | Jinlong Yang | Daimeng Wei | Zongyao Li | Jiaxin Guo | Shaojun Li | Zhengzhe Yu | Zhanglin Wu | Yuhao Xie | Bin Wei | Jiawei Zheng | Lizhi Lei | Hao Yang
Proceedings of the 20th International Conference on Spoken Language Translation (IWSLT 2023)

This paper presents the submission of Huawei Translation Services Center for the IWSLT 2023 dubbing task in the unconstrained setting. The proposed solution consists of a Transformer-based machine translation model and a phoneme duration predictor. The Transformer is deep, and multiple target-to-source length-ratio class labels are used to control target lengths. The variance predictor in FastSpeech2 is utilized to predict phoneme durations. To optimize isochrony in dubbing, re-ranking and scaling are performed. The source audio duration is used as a reference to re-rank the translations produced under different length-ratio labels, and the one with the minimum time deviation is preferred. Additionally, the phoneme duration outputs are scaled within a defined threshold to narrow the duration gap with the source audio.
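
The re-ranking and scaling steps lend themselves to a short sketch. The snippet below is illustrative only: candidate generation and the duration predictor are assumed to exist upstream, and the clipping threshold is an invented value, not the paper's setting.

```python
# Sketch of isochrony-oriented re-ranking and duration scaling (illustrative).

def rerank_by_duration(candidates, src_duration):
    """Prefer the candidate whose predicted duration deviates least from the
    source audio; `candidates` holds (translation, predicted_seconds) pairs,
    one per target-to-source length-ratio label."""
    return min(candidates, key=lambda c: abs(c[1] - src_duration))


def scale_phoneme_durations(durations, src_duration, max_scale=1.2):
    """Scale predicted phoneme durations toward the source audio duration,
    clipping the factor within a threshold (value here is hypothetical)
    so the synthesized speech stays natural."""
    total = sum(durations)
    factor = src_duration / total if total > 0 else 1.0
    factor = max(1.0 / max_scale, min(max_scale, factor))  # clip within threshold
    return [d * factor for d in durations]


# Example: pick the translation closest to a 3.2 s source clip, then rescale.
candidates = [("Hello there.", 2.6), ("Well, hello there!", 3.4)]
best_text, best_dur = rerank_by_duration(candidates, 3.2)
scaled = scale_phoneme_durations([0.10, 0.20, 0.15], 3.2)
```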

Improving Neural Machine Translation Formality Control with Domain Adaptation and Reranking-based Transductive Learning
Zhanglin Wu | Zongyao Li | Daimeng Wei | Hengchao Shang | Jiaxin Guo | Xiaoyu Chen | Zhiqiang Rao | Zhengzhe Yu | Jinlong Yang | Shaojun Li | Yuhao Xie | Bin Wei | Jiawei Zheng | Ming Zhu | Lizhi Lei | Hao Yang | Yanfei Jiang
Proceedings of the 20th International Conference on Spoken Language Translation (IWSLT 2023)

This paper presents Huawei Translation Service Center (HW-TSC)’s submission to the IWSLT 2023 formality control task, which covers two training scenarios, supervised and zero-shot, each containing two language pairs, under both constrained and unconstrained conditions. We train formality control models for these four language pairs under both conditions and submit the corresponding translation results. Our efforts fall along two fronts: enhancing general translation quality and improving formality control capability. According to the requirements of the formality control task, we use a multi-stage pre-training method to train a bilingual or multilingual neural machine translation (NMT) model as the base model, raising its general translation quality to a relatively high level. Then, while affecting the base model's general translation quality as little as possible, we adopt domain adaptation and reranking-based transductive learning to improve its formality control capability.
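
A schematic of the reranking-based transductive learning step, under the assumption that it selects formality-matching hypotheses from the model's own n-best lists to build pseudo-parallel fine-tuning data; the helper names and the toy scorer are hypothetical, not HW-TSC's implementation:

```python
# Sketch of reranking-based transductive learning for formality control
# (hypothetical; the NMT model and a real formality scorer are assumed).

def transductive_rerank(nbest_lists, formality_score, target="formal"):
    """For each source sentence, keep the n-best hypothesis that best matches
    the desired formality; the selections then serve as pseudo-parallel data
    for a further fine-tuning round on the same in-domain inputs."""
    pseudo_corpus = []
    for src, hypotheses in nbest_lists:
        best = max(hypotheses, key=lambda hyp: formality_score(hyp, target))
        pseudo_corpus.append((src, best))
    return pseudo_corpus


def toy_formality_score(hyp, target):
    """Toy scorer counting formal German pronouns; a real system would use a
    trained formality classifier instead."""
    hits = sum(hyp.count(m) for m in ("Sie", "Ihnen"))
    return hits if target == "formal" else -hits
```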