Weifeng Liu


2024

pdf bib
Adaptive Immune-based Sound-Shape Code Substitution for Adversarial Chinese Text Attacks
Ao Wang | Xinghao Yang | Chen Li | Bao-di Liu | Weifeng Liu
Proceedings of the 2024 Conference on Empirical Methods in Natural Language Processing

Adversarial textual examples reveal the vulnerability of natural language processing (NLP) models. Most existing text attack methods are designed for English text, while the robust implementation of the second popular language, i.e., Chinese with 1 billion users, is greatly underestimated. Although several Chinese attack methods have been presented, they either directly transfer from English attacks or adopt simple greedy search to optimize the attack priority, usually leading to unnatural sentences. To address these issues, we propose an adaptive Immune-based Sound-Shape Code (ISSC) algorithm for adversarial Chinese text attacks. Firstly, we leverage the Sound-Shape code to generate natural substitutions, which comprehensively integrate multiple Chinese features. Secondly, we employ adaptive immune algorithm (IA) to determine the replacement order, which can reduce the duplication of population to improve the search ability. Extensive experimental results validate the superiority of our ISSC in producing high-quality Chinese adversarial texts. Our code and data can be found in https://github.com/nohuma/chinese-attack-issc.

2022

pdf bib
Vega-MT: The JD Explore Academy Machine Translation System for WMT22
Changtong Zan | Keqin Peng | Liang Ding | Baopu Qiu | Boan Liu | Shwai He | Qingyu Lu | Zheng Zhang | Chuang Liu | Weifeng Liu | Yibing Zhan | Dacheng Tao
Proceedings of the Seventh Conference on Machine Translation (WMT)

We describe the JD Explore Academy’s submission of the WMT 2022 shared general translation task. We participated in all high-resource tracks and one medium-resource track, including Chinese-English, German-English, Czech-English, Russian-English, and Japanese-English. We push the limit of our previous work – bidirectional training for translation by scaling up two main factors, i.e. language pairs and model sizes, namely the Vega-MT system. As for language pairs, we scale the “bidirectional” up to the “multidirectional” settings, covering all participating languages, to exploit the common knowledge across languages, and transfer them to the downstream bilingual tasks. As for model sizes, we scale the Transformer-Big up to the extremely large model that owns nearly 4.7 Billion parameters, to fully enhance the model capacity for our Vega-MT. Also, we adopt the data augmentation strategies, e.g. cycle translation for monolingual data, and bidirectional self-training for bilingual and monolingual data, to comprehensively exploit the bilingual and monolingual data. To adapt our Vega-MT to the general domain test set, generalization tuning is designed. Based on the official automatic scores of constrained systems, in terms of the sacreBLEU shown in Figure-1, we got the 1st place on Zh-En (33.5), En-Zh (49.7), De-En (33.7), En-De (37.8), Cs-En (54.9), En-Cs (41.4) and En-Ru (32.7), 2nd place on Ru-En (45.1) and Ja-En (25.6), and 3rd place on En-Ja(41.5), respectively; W.R.T the COMET, we got the 1st place on Zh-En (45.1), En-Zh (61.7), De-En (58.0), En-De (63.2), Cs-En (74.7), Ru-En (64.9), En-Ru (69.6) and En-Ja (65.1), 2nd place on En-Cs (95.3) and Ja-En (40.6), respectively. Models will be released to facilitate the MT community through GitHub and OmniForce Platform.

pdf bib
On the Complementarity between Pre-Training and Random-Initialization for Resource-Rich Machine Translation
Changtong Zan | Liang Ding | Li Shen | Yu Cao | Weifeng Liu | Dacheng Tao
Proceedings of the 29th International Conference on Computational Linguistics

Pre-Training (PT) of text representations has been successfully applied to low-resource Neural Machine Translation (NMT). However, it usually fails to achieve notable gains (some- times, even worse) on resource-rich NMT on par with its Random-Initialization (RI) counterpart. We take the first step to investigate the complementarity between PT and RI in resource-rich scenarios via two probing analyses, and find that: 1) PT improves NOT the accuracy, but the generalization by achieving flatter loss landscapes than that of RI; 2) PT improves NOT the confidence of lexical choice, but the negative diversity by assigning smoother lexical probability distributions than that of RI. Based on these insights, we propose to combine their complementarities with a model fusion algorithm that utilizes optimal transport to align neurons between PT and RI. Experiments on two resource-rich translation benchmarks, WMT’17 English-Chinese (20M) and WMT’19 English-German (36M), show that PT and RI could be nicely complementary to each other, achieving substantial improvements considering both translation accuracy, generalization, and negative diversity. Probing tools and code are released at: https://github.com/zanchangtong/PTvsRI.