Large-scale pretrained language models have achieved SOTA results on NLP tasks. However, they have been shown vulnerable to adversarial attacks especially for logographic languages like Chinese. In this work, we propose RoCBert: a pretrained Chinese Bert that is robust to various forms of adversarial attacks like word perturbation, synonyms, typos, etc. It is pretrained with the contrastive learning objective which maximizes the label consistency under different synthesized adversarial examples. The model takes as input multimodal information including the semantic, phonetic and visual features. We show all these features areimportant to the model robustness since the attack can be performed in all the three forms. Across 5 Chinese NLU tasks, RoCBert outperforms strong baselines under three blackbox adversarial algorithms without sacrificing the performance on clean testset. It also performs the best in the toxic content detection task under human-made attacks.
Prompt-based paradigm has shown its competitive performance in many NLP tasks. However, its success heavily depends on prompt design, and the effectiveness varies upon the model and training data. In this paper, we propose a novel dual context-guided continuous prompt (DCCP) tuning method. To explore the rich contextual information in language structure and close the gap between discrete prompt tuning and continuous prompt tuning, DCCP introduces two auxiliary training objectives and constructs input in a pair-wise fashion.Experimental results demonstrate that our method is applicable to many NLP tasks, and can often outperform existing prompt tuning methods by a large margin in the few-shot setting.
Neural network-based sequence-to-sequence (seq2seq) models strongly suffer from the low-diversity problem when it comes to open-domain dialogue generation. As bland and generic utterances usually dominate the frequency distribution in our daily chitchat, avoiding them to generate more interesting responses requires complex data filtering, sampling techniques or modifying the training objective. In this paper, we propose a new perspective to diversify dialogue generation by leveraging non-conversational text. Compared with bilateral conversations, non-conversational text are easier to obtain, more diverse and cover a much broader range of topics. We collect a large-scale non-conversational corpus from multi sources including forum comments, idioms and book snippets. We further present a training paradigm to effectively incorporate these text via iterative back translation. The resulting model is tested on two conversational datasets from different domains and is shown to produce significantly more diverse responses without sacrificing the relevance with context.
Being able to perform in-depth chat with humans in a closed domain is a precondition before an open-domain chatbot can be ever claimed. In this work, we take a close look at the movie domain and present a large-scale high-quality corpus with fine-grained annotations in hope of pushing the limit of movie-domain chatbots. We propose a unified, readily scalable neural approach which reconciles all subtasks like intent prediction and knowledge retrieval. The model is first pretrained on the huge general-domain data, then finetuned on our corpus. We show this simple neural approach trained on high-quality data is able to outperform commercial systems replying on complex rules. On both the static and interactive tests, we find responses generated by our system exhibits remarkably good engagement and sensibleness close to human-written ones. We further analyze the limits of our work and point out potential directions for future work