Yanqing Zhao


Part Represents Whole: Improving the Evaluation of Machine Translation System Using Entropy Enhanced Metrics
Yilun Liu | Shimin Tao | Chang Su | Min Zhang | Yanqing Zhao | Hao Yang
Findings of the Association for Computational Linguistics: AACL-IJCNLP 2022

Machine translation (MT) metrics often correlate poorly with human assessments. In MT system evaluation, most metrics pay equal attention to every sample in an evaluation set, while in human evaluation, difficult sentences often make candidate systems distinguishable via notable fluctuations in human scores, especially when systems are competitive. We find that samples with high entropy values, although they usually account for less than 5% of the set, tend to play a key role in MT evaluation: when the evaluation set is shrunk to only the high-entropy portion, correlations with human assessments actually improve. Thus, in this paper, we propose a fast and unsupervised approach to enhance MT metrics using entropy, expanding the dimension of evaluation by introducing sentence-level difficulty. A translation hypothesis with a significantly high entropy value is considered difficult and receives a large weight in the aggregation of system-level scores. Experimental results on five sub-tracks of the WMT19 Metrics shared task show that our proposed method significantly enhances the performance of commonly used MT metrics in terms of system-level correlations with human assessments, even outperforming existing SOTA metrics. In particular, all enhanced metrics exhibit overall stability in correlations with human assessments when only competitive MT systems are included, while the corresponding vanilla metrics fail to correlate with human assessments.
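
The aggregation idea in this abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the function name, the `top_fraction` cutoff, and the `hard_weight` value are all hypothetical, and the per-sentence entropy values are assumed to be computed elsewhere. The sketch only shows how giving high-entropy hypotheses a larger weight changes a system-level score compared with a plain average.

```python
from typing import Sequence


def entropy_weighted_system_score(
    segment_scores: Sequence[float],
    entropies: Sequence[float],
    top_fraction: float = 0.05,  # hypothetical share of samples treated as difficult
    hard_weight: float = 10.0,   # hypothetical weight given to difficult samples
) -> float:
    """Aggregate segment-level metric scores into one system-level score,
    up-weighting the highest-entropy (assumed most difficult) hypotheses."""
    if len(segment_scores) != len(entropies) or not segment_scores:
        raise ValueError("need equally sized, non-empty score and entropy lists")

    n = len(segment_scores)
    # Number of samples treated as difficult; always at least one.
    k = max(1, round(n * top_fraction))
    # Indices of the k hypotheses with the highest entropy.
    hard = set(sorted(range(n), key=lambda i: entropies[i], reverse=True)[:k])

    weights = [hard_weight if i in hard else 1.0 for i in range(n)]
    weighted_sum = sum(w * s for w, s in zip(weights, segment_scores))
    return weighted_sum / sum(weights)
```

With `hard_weight=1.0` the function reduces to the ordinary mean, which corresponds to the equal-attention behaviour the abstract attributes to vanilla metrics.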

Partial Could Be Better than Whole. HW-TSC 2022 Submission for the Metrics Shared Task
Yilun Liu | Xiaosong Qiao | Zhanglin Wu | Su Chang | Min Zhang | Yanqing Zhao | Song Peng | Shimin Tao | Hao Yang | Ying Qin | Jiaxin Guo | Minghan Wang | Yinglu Li | Peng Li | Xiaofeng Zhao
Proceedings of the Seventh Conference on Machine Translation (WMT)

In this paper, we present the contribution of HW-TSC to the WMT 2022 Metrics Shared Task. We propose one reference-based metric, HWTSC-EE-BERTScore*, and four reference-free metrics: HWTSC-Teacher-Sim, HWTSC-TLM, KG-BERTScore and CROSS-QE. Among these metrics, HWTSC-Teacher-Sim and CROSS-QE are supervised, whereas HWTSC-EE-BERTScore*, HWTSC-TLM and KG-BERTScore are unsupervised. We use these metrics in the segment-level and system-level tracks. Overall, our systems achieve strong results for all language pairs on previous test sets and a new state-of-the-art in many system-level case sets.

HW-TSC Translation Systems for the WMT22 Chat Translation Task
Jinlong Yang | Zongyao Li | Daimeng Wei | Hengchao Shang | Xiaoyu Chen | Zhengzhe Yu | Zhiqiang Rao | Shaojun Li | Zhanglin Wu | Yuhao Xie | Yuanchang Luo | Ting Zhu | Yanqing Zhao | Lizhi Lei | Hao Yang | Ying Qin
Proceedings of the Seventh Conference on Machine Translation (WMT)

This paper describes the submissions of Huawei Translation Services Center (HW-TSC) to the WMT22 chat translation shared task on English-German (en-de) in both directions, with results for the zero-shot and few-shot tracks. We use the deep transformer architecture with a larger parameter size. Our submissions to the WMT21 News Translation task are used as the baselines. We adopt strategies such as back translation, forward translation, domain transfer, data selection, and noisy forward translation in this task, and achieve competitive results on the development set. We also test the effectiveness of document translation on chat tasks. Due to the lack of chat data, the results on the development set show that it is not as effective as sentence-level translation models.


A CRF Sequence Labeling Approach to Chinese Punctuation Prediction
Yanqing Zhao | Chaoyue Wang | Guohong Fu
Proceedings of the 26th Pacific Asia Conference on Language, Information, and Computation