Yeshuang Zhu

2022

AutoCAD: Automatically Generate Counterfactuals for Mitigating Shortcut Learning
Jiaxin Wen | Yeshuang Zhu | Jinchao Zhang | Jie Zhou | Minlie Huang
Findings of the Association for Computational Linguistics: EMNLP 2022

Recent studies have shown the impressive efficacy of counterfactually augmented data (CAD) for reducing NLU models’ reliance on spurious features and improving their generalizability. However, current methods still heavily rely on human efforts or task-specific designs to generate counterfactuals, thereby impeding CAD’s applicability to a broad range of NLU tasks. In this paper, we present AutoCAD, a fully automatic and task-agnostic CAD generation framework. AutoCAD first leverages a classifier to unsupervisedly identify rationales as spans to be intervened, which disentangles spurious and causal features. Then, AutoCAD performs controllable generation enhanced by unlikelihood training to produce diverse counterfactuals. Extensive evaluations on multiple out-of-domain and challenge benchmarks demonstrate that AutoCAD consistently and significantly boosts the out-of-distribution performance of powerful pre-trained models across different NLU tasks, which is comparable or even better than previous state-of-the-art human-in-the-loop or task-specific CAD methods.

pdf bib abs

Selecting Stickers in Open-Domain Dialogue through Multitask Learning
Zhexin Zhang | Yeshuang Zhu | Zhengcong Fei | Jinchao Zhang | Jie Zhou
Findings of the Association for Computational Linguistics: ACL 2022

With the increasing popularity of online chatting, stickers are becoming important in our online communication. Selecting appropriate stickers in open-domain dialogue requires a comprehensive understanding of both dialogues and stickers, as well as the relationship between the two types of modalities. To tackle these challenges, we propose a multitask learning method comprised of three auxiliary tasks to enhance the understanding of dialogue history, emotion and semantic meaning of stickers. Extensive experiments conducted on a recent challenging dataset show that our model can better combine the multimodal information and achieve significantly higher accuracy over strong baselines. Ablation study further verifies the effectiveness of each auxiliary task. Our code is available at https://github.com/nonstopfor/Sticker-Selection.

2021

pdf bib

Toward Fully Exploiting Heterogeneous Corpus:A Decoupled Named Entity Recognition Model with Two-stage Training
Yun Hu | Yeshuang Zhu | Jinchao Zhang | Changwen Zheng | Jie Zhou
Findings of the Association for Computational Linguistics: ACL-IJCNLP 2021

Co-authors

Venues

Findings3

Fix author