Yuan Gao


2023

Data Augmentation with Diversified Rephrasing for Low-Resource Neural Machine Translation
Yuan Gao | Feng Hou | Huia Jahnke | Ruili Wang
Proceedings of Machine Translation Summit XIX, Vol. 1: Research Track

Data augmentation is an effective way to enhance the performance of neural machine translation models, especially for low-resource languages. Existing data augmentation methods operate at either the token level or the sentence level. Data augmented with token-level methods lacks syntactic diversity and may alter the original meaning, while sentence-level methods usually generate low-quality source sentences that are not semantically paired with the original target sentences. In this paper, we propose a novel data augmentation method to generate diverse, high-quality, and meaning-preserving new instances. Our method leverages high-quality translation models trained on high-resource languages to rephrase an original sentence by translating it into an intermediate language and then back to the original language. Through this process, the high-performing translation models guarantee the quality of the rephrased sentences, and the syntactic knowledge from the intermediate language brings syntactic diversity to them. Experimental results show that our method improves performance on various low-resource machine translation tasks, and that combining it with other techniques that facilitate NMT yields even better results.
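The core rephrasing step is round-trip translation through a pivot language. A minimal sketch using off-the-shelf Marian models (the specific checkpoints and the choice of German as the pivot are illustrative assumptions, not the paper's exact setup):

```python
# Sketch of round-trip rephrasing: translate each source sentence into an
# intermediate (pivot) language and back, then pair the rephrased source
# with the untouched target sentence to form a new training instance.
# Model choice (German pivot via Marian) is an assumption for illustration.
from transformers import MarianMTModel, MarianTokenizer

def load(name):
    return MarianTokenizer.from_pretrained(name), MarianMTModel.from_pretrained(name)

fwd_tok, fwd = load("Helsinki-NLP/opus-mt-en-de")   # English -> pivot
bwd_tok, bwd = load("Helsinki-NLP/opus-mt-de-en")   # pivot -> English

def rephrase(sentences):
    """Return syntactically diversified paraphrases of English sentences."""
    def translate(texts, tok, model):
        batch = tok(texts, return_tensors="pt", padding=True, truncation=True)
        out = model.generate(**batch, num_beams=4, max_length=256)
        return tok.batch_decode(out, skip_special_tokens=True)
    pivot = translate(sentences, fwd_tok, fwd)      # en -> de
    return translate(pivot, bwd_tok, bwd)           # de -> en

# New instances keep the original target side unchanged.
augmented_sources = rephrase(["the cat sat on the mat ."])
```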

Composable Text Controls in Latent Space with ODEs
Guangyi Liu | Zeyu Feng | Yuan Gao | Zichao Yang | Xiaodan Liang | Junwei Bao | Xiaodong He | Shuguang Cui | Zhen Li | Zhiting Hu
Proceedings of the 2023 Conference on Empirical Methods in Natural Language Processing

Real-world text applications often involve composing a wide range of text control operations, such as editing text with respect to an attribute, manipulating keywords and structure, and generating new text with desired properties. Prior work typically learns or fine-tunes a language model (LM) to perform individual operations or specific subsets of them. Recent research has studied combining operations in a plug-and-play manner, often with costly search or optimization in the complex sequence space. This paper proposes a new, efficient approach to composable text operations in the compact latent space of text. The low dimensionality and differentiability of the text latent vector allow us to develop an efficient sampler based on ordinary differential equations (ODEs) given arbitrary plug-in operators (e.g., attribute classifiers). By connecting pretrained LMs (e.g., GPT2) to the latent space through efficient adaptation, we then decode the sampled vectors into the desired text sequences. This flexible approach permits diverse control operators (sentiment, tense, formality, keywords, etc.) acquired from any relevant data in different domains. Experiments show that composing these operators within our approach generates or edits high-quality text, substantially improving over previous methods in both generation quality and efficiency.
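To make the sampler concrete: plug-in operators define a differentiable energy over the latent vector, and sampling follows that energy's gradient field. The sketch below is a simplified gradient-flow caricature with stub classifiers and a hypothetical decoder; the paper's actual sampler is a diffusion-style probability-flow ODE:

```python
# Gradient-flow caricature of ODE sampling in a text latent space:
# plug-in operators define a differentiable energy over the latent vector z,
# and we integrate dz/dt = -grad E(z) with Euler steps before decoding.
# Classifiers, latent dimension, and decoder are illustrative stubs.
import torch

def compose_energy(z, operators, weights):
    """Weighted sum of plug-in operator losses (e.g., attribute classifiers)."""
    return sum(w * op(z) for op, w in zip(operators, weights))

def ode_sample(operators, weights, dim=128, steps=200, dt=0.05):
    z = torch.randn(1, dim, requires_grad=True)
    for _ in range(steps):
        energy = compose_energy(z, operators, weights)
        grad, = torch.autograd.grad(energy, z)
        z = (z - dt * grad).detach().requires_grad_(True)  # Euler step
    return z.detach()

# Usage: compose, e.g., sentiment + formality operators, then decode the
# sampled vector with a latent-conditioned LM (decoder omitted here):
# z = ode_sample([sentiment_loss, formality_loss], [1.0, 0.5])
# text = decoder.generate(latent=z)
```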

On Prefix-tuning for Lightweight Out-of-distribution Detection
Yawen Ouyang | Yongchang Cao | Yuan Gao | Zhen Wu | Jianbing Zhang | Xinyu Dai
Proceedings of the 61st Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers)

Out-of-distribution (OOD) detection, a fundamental task vexing real-world applications, has attracted growing attention in the NLP community. Recently, fine-tuning based methods have made promising progress. However, storing a fine-tuned model for each scenario can be costly. In this paper, we depart from classic fine-tuning based OOD detection toward a parameter-efficient alternative and propose an unsupervised prefix-tuning based OOD detection framework termed PTO. Additionally, to take advantage of optional training-data labels and targeted OOD data, we propose two practical extensions of PTO. Overall, PTO and its extensions offer several key advantages: they are lightweight, easy to reproduce, and theoretically justified. Experimental results show that our methods perform comparably to, and even better than, existing fine-tuning based OOD detection approaches under a wide range of metrics, detection settings, and OOD types.
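The detection rule can be illustrated as likelihood scoring under a prefix-tuned LM: only a short prefix is trained on in-distribution text, and inputs the prefixed model finds surprising are flagged. A minimal sketch, assuming GPT-2 via the peft library (the base model, prefix length, and thresholding are illustrative assumptions, not PTO's exact formulation):

```python
# Sketch of likelihood-based OOD scoring with a prefix-tuned LM: train only
# a short prefix on in-distribution text with the unsupervised LM loss,
# then flag inputs whose per-token NLL under the prefixed model is high.
# Base model, prefix length, and threshold are illustrative assumptions.
import torch
from transformers import GPT2LMHeadModel, GPT2TokenizerFast
from peft import PrefixTuningConfig, get_peft_model

tok = GPT2TokenizerFast.from_pretrained("gpt2")
base = GPT2LMHeadModel.from_pretrained("gpt2")
cfg = PrefixTuningConfig(task_type="CAUSAL_LM", num_virtual_tokens=20)
model = get_peft_model(base, cfg)  # only the prefix parameters are trainable

@torch.no_grad()
def ood_score(text):
    """Higher score (per-token NLL) = more likely out-of-distribution."""
    ids = tok(text, return_tensors="pt").input_ids
    return model(input_ids=ids, labels=ids).loss.item()

# After prefix-tuning on in-distribution data, threshold the score:
# is_ood = ood_score("some incoming utterance") > THRESHOLD
```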

2020

WeChat Neural Machine Translation Systems for WMT20
Fandong Meng | Jianhao Yan | Yijin Liu | Yuan Gao | Xianfeng Zeng | Qinsong Zeng | Peng Li | Ming Chen | Jie Zhou | Sifan Liu | Hao Zhou
Proceedings of the Fifth Conference on Machine Translation

We participate in the WMT 2020 shared news translation task on Chinese→English. Our system is based on the Transformer (Vaswani et al., 2017a) with effective variants and the DTMT (Meng and Zhang, 2019) architecture. In our experiments, we employ data selection, several synthetic data generation approaches (i.e., back-translation, knowledge distillation, and iterative in-domain knowledge transfer), advanced fine-tuning approaches, and self-BLEU based model ensembling. Our constrained Chinese→English system achieves a case-sensitive BLEU score of 36.9, which is the highest among all submissions.
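Self-BLEU based ensembling picks member models whose outputs disagree, so the ensemble averages over genuinely different hypotheses. A minimal sketch of one plausible selection rule using sacrebleu (the greedy criterion and the seeding choice are assumptions, not necessarily the submitted system's procedure):

```python
# Sketch of self-BLEU based ensemble selection: measure pairwise BLEU
# between candidate models' dev-set outputs and greedily keep models whose
# translations are most diverse (lowest total BLEU against those already
# selected). The greedy rule here is an illustrative assumption.
import sacrebleu

def pairwise_bleu(hyps_a, hyps_b):
    """Corpus BLEU treating one system's outputs as the reference."""
    return sacrebleu.corpus_bleu(hyps_a, [hyps_b]).score

def select_diverse(candidates, k):
    """candidates: {model_name: list of dev translations}; returns k names."""
    names = list(candidates)
    chosen = [names[0]]  # seed with the first (in practice, the strongest) model
    while len(chosen) < min(k, len(names)):
        best = min(
            (n for n in names if n not in chosen),
            key=lambda n: sum(pairwise_bleu(candidates[n], candidates[c])
                              for c in chosen),
        )
        chosen.append(best)
    return chosen
```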