PAEG: Phrase-level Adversarial Example Generation for Neural Machine Translation

Juncheng Wan, Jian Yang, Shuming Ma, Dongdong Zhang, Weinan Zhang, Yong Yu, Zhoujun Li


Abstract
While end-to-end neural machine translation (NMT) has achieved impressive progress, noisy input often makes models fragile and unstable. Generating adversarial examples as augmented data has proven useful for alleviating this problem. Existing methods for adversarial example generation (AEG) operate at the word or character level and ignore the ubiquitous phrase structure. In this paper, we propose a Phrase-level Adversarial Example Generation (PAEG) framework to enhance the robustness of the translation model. Our method improves on the gradient-based word-level AEG method by adopting a phrase-level substitution strategy. We verify our method on three benchmarks: the LDC Chinese-English, IWSLT14 German-English, and WMT14 English-German tasks. Experimental results demonstrate that our approach significantly improves translation performance and robustness to noise compared to previous strong baselines.
Anthology ID:
2022.coling-1.451
Volume:
Proceedings of the 29th International Conference on Computational Linguistics
Month:
October
Year:
2022
Address:
Gyeongju, Republic of Korea
Venue:
COLING
Publisher:
International Committee on Computational Linguistics
Pages:
5085–5097
URL:
https://aclanthology.org/2022.coling-1.451
Cite (ACL):
Juncheng Wan, Jian Yang, Shuming Ma, Dongdong Zhang, Weinan Zhang, Yong Yu, and Zhoujun Li. 2022. PAEG: Phrase-level Adversarial Example Generation for Neural Machine Translation. In Proceedings of the 29th International Conference on Computational Linguistics, pages 5085–5097, Gyeongju, Republic of Korea. International Committee on Computational Linguistics.
Cite (Informal):
PAEG: Phrase-level Adversarial Example Generation for Neural Machine Translation (Wan et al., COLING 2022)
PDF:
https://aclanthology.org/2022.coling-1.451.pdf