Ao Wang


2024

pdf bib
Adaptive Immune-based Sound-Shape Code Substitution for Adversarial Chinese Text Attacks
Ao Wang | Xinghao Yang | Chen Li | Bao-di Liu | Weifeng Liu
Proceedings of the 2024 Conference on Empirical Methods in Natural Language Processing

Adversarial textual examples reveal the vulnerability of natural language processing (NLP) models. Most existing text attack methods are designed for English text, while the robust implementation of the second popular language, i.e., Chinese with 1 billion users, is greatly underestimated. Although several Chinese attack methods have been presented, they either directly transfer from English attacks or adopt simple greedy search to optimize the attack priority, usually leading to unnatural sentences. To address these issues, we propose an adaptive Immune-based Sound-Shape Code (ISSC) algorithm for adversarial Chinese text attacks. Firstly, we leverage the Sound-Shape code to generate natural substitutions, which comprehensively integrate multiple Chinese features. Secondly, we employ adaptive immune algorithm (IA) to determine the replacement order, which can reduce the duplication of population to improve the search ability. Extensive experimental results validate the superiority of our ISSC in producing high-quality Chinese adversarial texts. Our code and data can be found in https://github.com/nohuma/chinese-attack-issc.

2019

pdf bib
Ensuring Readability and Data-fidelity using Head-modifier Templates in Deep Type Description Generation
Jiangjie Chen | Ao Wang | Haiyun Jiang | Suo Feng | Chenguang Li | Yanghua Xiao
Proceedings of the 57th Annual Meeting of the Association for Computational Linguistics

A type description is a succinct noun compound which helps human and machines to quickly grasp the informative and distinctive information of an entity. Entities in most knowledge graphs (KGs) still lack such descriptions, thus calling for automatic methods to supplement such information. However, existing generative methods either overlook the grammatical structure or make factual mistakes in generated texts. To solve these problems, we propose a head-modifier template based method to ensure the readability and data fidelity of generated type descriptions. We also propose a new dataset and two metrics for this task. Experiments show that our method improves substantially compared with baselines and achieves state-of-the-art performance on both datasets.