Zhiquan Cao


Augmenting Large Language Model Translators via Translation Memories
Yongyu Mu | Abudurexiti Reheman | Zhiquan Cao | Yuchun Fan | Bei Li | Yinqiao Li | Tong Xiao | Chunliang Zhang | Jingbo Zhu
Findings of the Association for Computational Linguistics: ACL 2023

Using translation memories (TMs) as prompts is a promising approach to in-context learning of machine translation models. In this work, we take a step towards prompting large language models (LLMs) with TMs and making them better translators. We find that the ability of LLMs to “understand” prompts is indeed helpful for making better use of TMs. Experiments show that the results of a pre-trained LLM translator can be greatly improved by using high-quality TM-based prompts. These results are even comparable to those of the state-of-the-art NMT systems which have access to large-scale in-domain bilingual data and are well tuned on the downstream tasks.
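As an illustration of what a TM-based prompt might look like, here is a minimal sketch. It is not the authors' implementation: the retrieval metric (`difflib.SequenceMatcher` similarity), the prompt template, the function names `retrieve_tm` and `build_prompt`, and the toy memory are all assumptions made for this example.

```python
from difflib import SequenceMatcher


def retrieve_tm(source, memory, k=2):
    """Return the k TM pairs whose source side is most similar to `source`.

    Similarity here is difflib's character-level ratio, an assumption for
    this sketch; a real system may use a different fuzzy-match score.
    """
    ranked = sorted(
        memory,
        key=lambda pair: SequenceMatcher(None, source, pair[0]).ratio(),
        reverse=True,
    )
    return ranked[:k]


def build_prompt(source, memory, k=2, src_lang="English", tgt_lang="German"):
    """Format retrieved TM pairs as few-shot demonstrations, then the query."""
    lines = []
    for tm_src, tm_tgt in retrieve_tm(source, memory, k):
        lines.append(f"{src_lang}: {tm_src}")
        lines.append(f"{tgt_lang}: {tm_tgt}")
    # The final target line is left open for the LLM to complete.
    lines.append(f"{src_lang}: {source}")
    lines.append(f"{tgt_lang}:")
    return "\n".join(lines)


# Hypothetical toy translation memory of (source, target) pairs.
memory = [
    ("Press the power button.", "Drücken Sie die Ein/Aus-Taste."),
    ("Close the window.", "Schließen Sie das Fenster."),
    ("Press the reset button.", "Drücken Sie die Reset-Taste."),
]
prompt = build_prompt("Press the start button.", memory, k=2)
print(prompt)
```

The resulting string would be passed to an LLM as-is; how the paper selects, orders, and formats TM entries may differ from this sketch.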

Improving Autoregressive Grammatical Error Correction with Non-autoregressive Models
Hang Cao | Zhiquan Cao | Chi Hu | Baoyu Hou | Tong Xiao | Jingbo Zhu
Findings of the Association for Computational Linguistics: ACL 2023

Grammatical Error Correction (GEC) aims to correct grammatical errors in sentences. We find that autoregressive models tend to assign low probabilities to tokens that need corrections. Here we introduce additional signals to the training of GEC models so that these systems can learn to predict better at ambiguous positions. To do this, we use a non-autoregressive model as an auxiliary model and develop a new regularization term for training that considers the difference in predictions between the autoregressive and non-autoregressive models. We experiment with this method on both English and Chinese GEC tasks. Experimental results show that our GEC system significantly outperforms the baselines on all data sets.
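The abstract does not give the exact form of the regularization term. The sketch below shows one plausible shape, assuming the term is a KL divergence between the two models' per-token distributions, added to the usual negative log-likelihood with a weight `lam`. The KL direction, the weight, and all function names are assumptions for illustration, not the authors' formulation.

```python
import math


def nll(probs, gold):
    """Negative log-likelihood of the gold token under one distribution."""
    return -math.log(probs[gold])


def kl_divergence(p, q, eps=1e-12):
    """KL(p || q) for two discrete distributions over the same vocabulary."""
    return sum(
        pi * math.log((pi + eps) / (qi + eps))
        for pi, qi in zip(p, q)
        if pi > 0
    )


def regularized_loss(ar_probs, nar_probs, gold, lam=0.5):
    """NLL of the autoregressive model plus a weighted disagreement penalty.

    The penalty KL(nar || ar) is an assumed choice; the paper's term may
    use a different divergence or direction.
    """
    return nll(ar_probs, gold) + lam * kl_divergence(nar_probs, ar_probs)


# Toy distributions over a three-token vocabulary at a single position.
ar_probs = [0.7, 0.2, 0.1]
nar_probs = [0.5, 0.4, 0.1]
loss = regularized_loss(ar_probs, nar_probs, gold=0)
```

When the two models agree exactly, the penalty vanishes and the loss reduces to the plain NLL; the more they disagree, the larger the added term.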


The NiuTrans Machine Translation Systems for WMT22
Weiqiao Shan | Zhiquan Cao | Yuchen Han | Siming Wu | Yimin Hu | Jie Wang | Yi Zhang | Hou Baoyu | Hang Cao | Chenghao Gao | Xiaowen Liu | Tong Xiao | Anxiang Ma | Jingbo Zhu
Proceedings of the Seventh Conference on Machine Translation (WMT)

This paper describes the NiuTrans neural machine translation systems for the WMT22 General MT constrained task. We participate in four directions: Chinese→English, English→Croatian, and Livonian↔English. Our models are based on several advanced Transformer variants, e.g., Transformer-ODE and Universal Multiscale Transformer (UMST). The main workflow consists of data filtering, large-scale data augmentation (i.e., iterative back-translation and iterative knowledge distillation), and specific-domain fine-tuning. Moreover, we try several multi-domain methods, such as a multi-domain model structure and a multi-domain data clustering method, to rise to this year’s newly proposed multi-domain test set challenge. For low-resource scenarios, we build a multi-language translation model to improve performance, and try using a pre-trained language model (mBERT) to initialize the translation model.