Junda Wu


pdf bib
Few-Shot Dialogue Summarization via Skeleton-Assisted Prompt Transfer in Prompt Tuning
Kaige Xie | Tong Yu | Haoliang Wang | Junda Wu | Handong Zhao | Ruiyi Zhang | Kanak Mahadik | Ani Nenkova | Mark Riedl
Proceedings of the 18th Conference of the European Chapter of the Association for Computational Linguistics (Volume 1: Long Papers)

In real-world scenarios, labeled samples for dialogue summarization are usually limited (i.e., few-shot) due to high annotation costs for high-quality dialogue summaries. To efficiently learn from few-shot samples, previous works have utilized massive annotated data from other downstream tasks and then performed prompt transfer in prompt tuning so as to enable cross-task knowledge transfer. However, existing general-purpose prompt transfer techniques lack consideration for dialogue-specific information. In this paper, we focus on improving the prompt transfer from dialogue state tracking to dialogue summarization and propose Skeleton-Assisted Prompt Transfer (SAPT), which leverages skeleton generation as extra supervision that functions as a medium connecting the distinct source and target task and resulting in the model’s better consumption of dialogue state information. To automatically extract dialogue skeletons as supervised training data for skeleton generation, we design a novel approach with perturbation-based probes requiring neither annotation effort nor domain knowledge. Training the model on such skeletons can also help preserve model capability during prompt transfer. Our method significantly outperforms existing baselines. In-depth analyses demonstrate the effectiveness of our method in facilitating cross-task knowledge transfer in few-shot dialogue summarization.


pdf bib
Federated Domain Adaptation for Named Entity Recognition via Distilling with Heterogeneous Tag Sets
Rui Wang | Tong Yu | Junda Wu | Handong Zhao | Sungchul Kim | Ruiyi Zhang | Subrata Mitra | Ricardo Henao
Findings of the Association for Computational Linguistics: ACL 2023

Federated learning involves collaborative training with private data from multiple platforms, while not violating data privacy. We study the problem of federated domain adaptation for Named Entity Recognition (NER), where we seek to transfer knowledge across different platforms with data of multiple domains. In addition, we consider a practical and challenging scenario, where NER datasets of different platforms of federated learning are annotated with heterogeneous tag sets, i.e., different sets of entity types. The goal is to train a global model with federated learning, such that it can predict with a complete tag set, i.e., with all the occurring entity types for data across all platforms. To cope with the heterogeneous tag sets in a multi-domain setting, we propose a distillation approach along with a mechanism of instance weighting to facilitate knowledge transfer across platforms. Besides, we release two re-annotated clinic NER datasets, for testing the proposed method in the clinic domain. Our method shows superior empirical performance for NER with federated learning.


pdf bib
Context-aware Information-theoretic Causal De-biasing for Interactive Sequence Labeling
Junda Wu | Rui Wang | Tong Yu | Ruiyi Zhang | Handong Zhao | Shuai Li | Ricardo Henao | Ani Nenkova
Findings of the Association for Computational Linguistics: EMNLP 2022

Supervised training of existing deep learning models for sequence labeling relies on large scale labeled datasets. Such datasets are generally created with crowd-source labeling. However, crowd-source labeling for tasks of sequence labeling can be expensive and time-consuming. Further, crowd-source labeling by external annotators may not be appropriate for data that contains user private information. Considering the above limitations of crowd-source labeling, we study interactive sequence labeling that allows training directly with the user feedback, which alleviates the annotation cost and maintains the user privacy. We identify two bias, namely, context bias and feedback bias, by formulating interactive sequence labeling via a Structural Causal Model (SCM). To alleviate the context and feedback bias based on the SCM, we identify the frequent context tokens as confounders in the backdoor adjustment and further propose an entropy-based modulation that is inspired by information theory. entities more sample-efficiently. With extensive experiments, we validate that our approach can effectively alleviate the biases and our models can be efficiently learnt with the user feedback.