2024
HW-TSC’s Speech to Text Translation System for IWSLT 2024 in Indic track
Bin Wei | Zongyao Li | Jiaxin Guo | Daimeng Wei | Zhanglin Wu | Xiaoyu Chen | Zhiqiang Rao | Shaojun Li | Yuanchang Luo | Hengchao Shang | Hao Yang | Yanfei Jiang
Proceedings of the 21st International Conference on Spoken Language Translation (IWSLT 2024)
This article introduces HW-TSC's system and results for the IWSLT 2024 Indic Track Speech-to-Text Translation task. We designed a cascade system consisting of an ASR model and a machine translation model to translate speech from one language to another. For the ASR part, we directly use Whisper large-v3 as our ASR model. Our main task is to optimize the machine translation models (en2ta, en2hi, en2bn). We first use a bilingual corpus to train the baseline model, then use monolingual data to construct pseudo-corpus data to further enhance it. Finally, we filter the parallel corpus with the LaBSE filtering method and fine-tune the model again, which further improves the BLEU score. We also select in-domain data from the bilingual corpus to fine-tune the previous model and achieve the best results.
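To make the LaBSE filtering step concrete, here is a minimal sketch that keeps only sentence pairs whose LaBSE embeddings are sufficiently similar; the sentence-transformers checkpoint and the 0.75 threshold are illustrative assumptions, not values from the paper.

```python
# Sketch of LaBSE-based parallel corpus filtering (illustrative; the
# threshold and batching details are assumptions, not from the paper).
import numpy as np
from sentence_transformers import SentenceTransformer

model = SentenceTransformer("sentence-transformers/LaBSE")

def labse_filter(src_sents, tgt_sents, threshold=0.75):
    """Keep sentence pairs whose LaBSE cosine similarity exceeds threshold."""
    src_emb = model.encode(src_sents, normalize_embeddings=True)
    tgt_emb = model.encode(tgt_sents, normalize_embeddings=True)
    # With normalized embeddings, the dot product equals cosine similarity.
    sims = np.sum(src_emb * tgt_emb, axis=1)
    return [(s, t) for s, t, sim in zip(src_sents, tgt_sents, sims)
            if sim >= threshold]

pairs = labse_filter(["How are you?"], ["आप कैसे हैं?"])
```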
HW-TSC’s Submissions To the IWSLT2024 Low-resource Speech Translation Tasks
Zheng Jiawei | Hengchao Shang | Zongyao Li | Zhanglin Wu | Daimeng Wei | Zhiqiang Rao | Shaojun Li | Jiaxin Guo | Bin Wei | Yuanchang Luo | Hao Yang
Proceedings of the 21st International Conference on Spoken Language Translation (IWSLT 2024)
In this work, we submitted our systems to the low-resource track of the IWSLT 2024 Speech Translation Campaign. Our systems tackled the unconstrained condition of the Dialectal Arabic North Levantine (ISO-3 code: apc) to English language pair. We proposed a cascaded solution consisting of an automatic speech recognition (ASR) model and a machine translation (MT) model. The ASR model employed the pre-trained Whisper-large-v3 model to process the speech data, while the MT model adopted the Transformer architecture. To improve the quality of the MT model, our system utilized not only the data provided by the competition but also an additional 54 million parallel sentences. Our final system achieved a BLEU score of 24.7 for apc-to-English translation.
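A minimal sketch of such a cascaded pipeline, assuming the openai-whisper package for the ASR stage; the MT stage is a hypothetical translate() call standing in for the paper's Transformer model, whose implementation is not public.

```python
# Minimal cascade sketch: Whisper-large-v3 ASR followed by an MT step.
# The MT model here is a hypothetical stand-in for the paper's Transformer.
import whisper

asr_model = whisper.load_model("large-v3")

def translate_apc_to_en(audio_path, mt_model):
    # Step 1: transcribe the Levantine Arabic speech.
    result = asr_model.transcribe(audio_path, language="ar")
    transcript = result["text"]
    # Step 2: translate the transcript with the MT model (hypothetical API).
    return mt_model.translate(transcript)
```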
HW-TSC’s Simultaneous Speech Translation System for IWSLT 2024
Shaojun Li | Zhiqiang Rao | Bin Wei | Yuanchang Luo | Zhanglin Wu | Zongyao Li | Hengchao Shang | Jiaxin Guo | Daimeng Wei | Hao Yang
Proceedings of the 21st International Conference on Spoken Language Translation (IWSLT 2024)
This paper outlines our submission for the IWSLT 2024 Simultaneous Speech-to-Text (SimulS2T) and Speech-to-Speech (SimulS2S) Translation competition. We participated in all four language directions across both the SimulS2T and SimulS2S tracks: English-German (EN-DE), English-Chinese (EN-ZH), English-Japanese (EN-JA), and Czech-English (CS-EN). For the S2T track, we built upon our previous year's system and further honed the cascade system composed of an ASR model and an MT model. Concurrently, we introduced an end-to-end (E2E) system specifically for the CS-EN direction, which primarily employs the pre-trained SeamlessM4T model. For the SimulS2S track, we integrated a novel TTS model into our SimulS2T system. The final submission for the S2T directions of EN-DE, EN-ZH, and EN-JA was refined over our championship system from last year. Building upon this foundation, the incorporation of the new TTS into our SimulS2S system resulted in an ASR-BLEU surpassing last year's best score.
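The abstract does not detail the streaming policy; as a rough illustration of how a cascaded simultaneous system can be driven, the sketch below re-runs ASR and MT on a growing audio buffer and commits only the translation prefix that is stable across consecutive hypotheses. All model APIs here are hypothetical placeholders, not the paper's actual read/write policy.

```python
# Rough sketch of a chunked cascade for simultaneous S2T translation.
# asr_model / mt_model methods are hypothetical placeholders.
def common_prefix(a, b):
    """Longest shared word prefix of two token lists."""
    out = []
    for x, y in zip(a, b):
        if x != y:
            break
        out.append(x)
    return out

def simul_s2t(chunks, asr_model, mt_model):
    emitted = []            # target words already shown to the user
    prev_hyp = []
    audio = b""
    for chunk in chunks:    # fixed-size audio chunks, e.g. 500 ms each
        audio += chunk
        transcript = asr_model.transcribe_partial(audio)   # hypothetical
        hyp = mt_model.translate(transcript).split()       # hypothetical
        # Commit only the prefix stable across consecutive hypotheses,
        # so output already emitted is never retracted.
        stable = common_prefix(prev_hyp, hyp)
        emitted.extend(stable[len(emitted):])
        prev_hyp = hyp
    return " ".join(emitted)
```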
Machine Translation Advancements of Low-Resource Indian Languages by Transfer Learning
Bin Wei | Zheng Jiawei | Zongyao Li | Zhanglin Wu | Jiaxin Guo | Daimeng Wei | Zhiqiang Rao | Shaojun Li | Yuanchang Luo | Hengchao Shang | Jinlong Yang | Yuhao Xie | Hao Yang
Proceedings of the Ninth Conference on Machine Translation
This paper introduces the submission by Huawei Translation Center (HW-TSC) to the WMT24 Indian Languages Machine Translation (MT) Shared Task. To develop a reliable machine translation system for low-resource Indian languages, we employed two distinct knowledge transfer strategies, taking into account the characteristics of the language scripts and the support available from existing open-source models for Indian languages. For Assamese (as) and Manipuri (mn), we fine-tuned the existing IndicTrans2 open-source model to enable bidirectional translation between English and these languages. For Khasi (kh) and Mizo (mz), we trained a multilingual model as the baseline using bilingual data from these four language pairs as well as additional Bengali data, which belongs to the same language family. This was followed by fine-tuning to achieve bidirectional translation between English and Khasi, as well as English and Mizo. Our transfer learning experiments produced significant results: 23.5 BLEU for en→as, 31.8 BLEU for en→mn, 36.2 BLEU for as→en, and 47.9 BLEU for mn→en on their respective test sets. Similarly, the multilingual model transfer learning experiments yielded impressive outcomes, achieving 19.7 BLEU for en→kh, 32.8 BLEU for en→mz, 16.1 BLEU for kh→en, and 33.9 BLEU for mz→en on their respective test sets. These results not only highlight the effectiveness of transfer learning techniques for low-resource languages but also contribute to advancing machine translation capabilities for low-resource Indian languages.
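As a hedged sketch of the IndicTrans2 fine-tuning route: the checkpoint name below is the public ai4bharat release on the Hugging Face Hub, but the hyperparameters are illustrative and the dataset construction (tokenized en-as pairs) is omitted, so this is a shape of the approach rather than the paper's recipe.

```python
# Sketch of fine-tuning an IndicTrans2 checkpoint with Hugging Face
# Transformers; hyperparameters are assumptions, dataset construction
# (tokenized en-as parallel pairs) is omitted for brevity.
from transformers import (AutoModelForSeq2SeqLM, AutoTokenizer,
                          Seq2SeqTrainer, Seq2SeqTrainingArguments)

name = "ai4bharat/indictrans2-en-indic-1B"   # public IndicTrans2 checkpoint
tokenizer = AutoTokenizer.from_pretrained(name, trust_remote_code=True)
model = AutoModelForSeq2SeqLM.from_pretrained(name, trust_remote_code=True)

args = Seq2SeqTrainingArguments(
    output_dir="indictrans2-en-as-finetuned",  # hypothetical en->as run
    learning_rate=3e-5,                        # illustrative values
    per_device_train_batch_size=16,
    num_train_epochs=3,
)
trainer = Seq2SeqTrainer(model=model, args=args,
                         train_dataset=train_dataset)  # tokenized pairs
trainer.train()
```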
Multilingual Transfer and Domain Adaptation for Low-Resource Languages of Spain
Yuanchang Luo | Zhanglin Wu | Daimeng Wei | Hengchao Shang | Zongyao Li | Jiaxin Guo | Zhiqiang Rao | Shaojun Li | Jinlong Yang | Yuhao Xie | Zheng Jiawei | Bin Wei | Hao Yang
Proceedings of the Ninth Conference on Machine Translation
This article introduces the submission by Huawei Translation Service Center (HW-TSC) to the WMT 2024 Translation into Low-Resource Languages of Spain task. We participated in three translation tasks: Spanish to Aragonese (es2arg), Spanish to Aranese (es2arn), and Spanish to Asturian (es2ast). For these three tasks, we applied training strategies such as multilingual transfer, regularized dropout, forward translation, back translation, LaBSE denoising, and transductive ensemble learning to a neural machine translation (NMT) model based on the deep Transformer-big architecture. With these enhancement strategies, our submission achieved a competitive result in the final evaluation.
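Among the listed strategies, regularized dropout (R-Drop) is simple to illustrate: each batch is passed through the model twice with independent dropout masks, and a symmetric KL term pulls the two output distributions together. A PyTorch sketch, assuming an HF-style seq2seq model that returns token logits; the alpha weight is illustrative:

```python
# R-Drop loss sketch in PyTorch: two stochastic forward passes plus a
# symmetric KL consistency term. The alpha weight is illustrative.
import torch.nn.functional as F

def rdrop_loss(model, batch, labels, alpha=5.0):
    logits1 = model(**batch).logits   # dropout active in training mode
    logits2 = model(**batch).logits   # second pass, new dropout mask
    # Cross-entropy expects (batch, vocab, time), hence the transpose.
    ce = 0.5 * (F.cross_entropy(logits1.transpose(1, 2), labels) +
                F.cross_entropy(logits2.transpose(1, 2), labels))
    p = F.log_softmax(logits1, dim=-1)
    q = F.log_softmax(logits2, dim=-1)
    kl = 0.5 * (F.kl_div(p, q, log_target=True, reduction="batchmean") +
                F.kl_div(q, p, log_target=True, reduction="batchmean"))
    return ce + alpha * kl
```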
Exploring the Traditional NMT Model and Large Language Model for Chat Translation
Jinlong Yang | Hengchao Shang | Daimeng Wei | Jiaxin Guo | Zongyao Li | Zhanglin Wu | Zhiqiang Rao | Shaojun Li | Yuhao Xie | Yuanchang Luo | Zheng Jiawei | Bin Wei | Hao Yang
Proceedings of the Ninth Conference on Machine Translation
This paper describes the submissions of Huawei Translation Services Center (HW-TSC) to the WMT24 chat translation shared task on English↔German (en-de) in both directions. The experiments involved fine-tuning models using chat data and exploring various strategies, including Minimum Bayesian Risk (MBR) decoding and self-training. The results show significant performance improvements in certain directions, with the MBR self-training method achieving the best results. The paper also discusses the challenges and potential avenues for further research in applying large language models to chat translation.
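As a sketch of MBR decoding: sample several candidate translations, score each against the others with a utility metric, and return the candidate with the highest expected utility. Sentence-level BLEU from sacrebleu stands in for the utility here; the paper's actual metric and sample size are not given in the abstract, so treat both as assumptions.

```python
# Minimal MBR decoding sketch: pick the candidate translation with the
# highest total utility against all other candidates. Sentence-level
# BLEU is used as the utility here; this choice is an assumption.
from sacrebleu import sentence_bleu

def mbr_select(candidates):
    def expected_utility(hyp):
        others = [c for c in candidates if c is not hyp]
        return sum(sentence_bleu(hyp, [ref]).score for ref in others)
    return max(candidates, key=expected_utility)

best = mbr_select(["Hi, how can I help?",
                   "Hello, how may I help you?",
                   "Hello, how can I help you?"])
```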
2023
Length-Aware NMT and Adaptive Duration for Automatic Dubbing
Zhiqiang Rao | Hengchao Shang | Jinlong Yang | Daimeng Wei | Zongyao Li | Jiaxin Guo | Shaojun Li | Zhengzhe Yu | Zhanglin Wu | Yuhao Xie | Bin Wei | Jiawei Zheng | Lizhi Lei | Hao Yang
Proceedings of the 20th International Conference on Spoken Language Translation (IWSLT 2023)
This paper presents the submission of Huawei Translation Services Center for the IWSLT 2023 dubbing task in the unconstrained setting. The proposed solution consists of a Transformer-based machine translation model and a phoneme duration predictor. The Transformer is deep, and multiple target-to-source length-ratio class labels are used to control target lengths. The variance predictor in FastSpeech2 is utilized to predict phoneme durations. To optimize the isochrony in dubbing, re-ranking and scaling are performed. The source audio duration is used as a reference to re-rank the translations of different length-ratio labels, and the one with the minimum time deviation is preferred. Additionally, the phoneme duration outputs are scaled within a defined threshold to narrow the duration gap with the source audio.
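The scaling step lends itself to a short sketch: predicted phoneme durations are stretched or compressed toward the source audio duration, with the scale factor clipped to a band so the speech rate stays natural. The clip bounds below are illustrative assumptions, not the paper's threshold.

```python
# Sketch of isochrony-oriented duration scaling: move the predicted
# phoneme durations toward the source audio length, clipping the scale
# factor to a band (the 0.8-1.2 bounds are assumptions).
def scale_durations(phoneme_durations, source_duration,
                    min_scale=0.8, max_scale=1.2):
    total = sum(phoneme_durations)
    scale = max(min_scale, min(max_scale, source_duration / total))
    return [d * scale for d in phoneme_durations]

# e.g. target speech predicted at 2.6 s against a 2.2 s source clip
scaled = scale_durations([0.12, 0.30, 0.18, 2.0], source_duration=2.2)
```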
Improving Neural Machine Translation Formality Control with Domain Adaptation and Reranking-based Transductive Learning
Zhanglin Wu | Zongyao Li | Daimeng Wei | Hengchao Shang | Jiaxin Guo | Xiaoyu Chen | Zhiqiang Rao | Zhengzhe Yu | Jinlong Yang | Shaojun Li | Yuhao Xie | Bin Wei | Jiawei Zheng | Ming Zhu | Lizhi Lei | Hao Yang | Yanfei Jiang
Proceedings of the 20th International Conference on Spoken Language Translation (IWSLT 2023)
This paper presents Huawei Translation Service Center (HW-TSC)'s submission to the IWSLT 2023 formality control task, which provides two training scenarios, supervised and zero-shot, each containing two language pairs, under both constrained and unconstrained conditions. We train formality control models for these four language pairs under both conditions and submit the corresponding translation results. Our efforts are divided into two fronts: enhancing general translation quality and improving formality control capability. According to the requirements of the formality control task, we use a multi-stage pre-training method to train a bilingual or multilingual neural machine translation (NMT) model as the basic model, which raises the general translation quality of the base model to a relatively high level. Then, while affecting the general translation quality of the basic model as little as possible, we adopt domain adaptation and reranking-based transductive learning methods to improve its formality control capability.
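The reranking-based transductive learning step can be sketched as: generate n-best candidates, rerank them with a formality scorer, and reuse the top-ranked outputs as fine-tuning data. The scorer and the n-best API below are hypothetical placeholders, not components released with the paper.

```python
# Sketch of reranking-based transductive learning for formality control.
# formality_score is a hypothetical classifier returning P(formal | text);
# nmt_model.generate_nbest is likewise a hypothetical API.
def rerank_formal(nbest, formality_score, want_formal=True):
    key = formality_score if want_formal else (lambda s: -formality_score(s))
    return max(nbest, key=key)

def build_transductive_corpus(test_sources, nmt_model, formality_score):
    corpus = []
    for src in test_sources:
        nbest = nmt_model.generate_nbest(src, n=10)   # hypothetical API
        corpus.append((src, rerank_formal(nbest, formality_score)))
    return corpus   # used to fine-tune the model on its own best outputs
```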
2010
Cross Lingual Adaptation: An Experiment on Sentiment Classifications
Bin Wei | Christopher Pal
Proceedings of the ACL 2010 Conference Short Papers