Changqun Li


2024

pdf bib
Hypernetwork-Assisted Parameter-Efficient Fine-Tuning with Meta-Knowledge Distillation for Domain Knowledge Disentanglement
Changqun Li | Linlin Wang | Xin Lin | Shizhou Huang | Liang He
Findings of the Association for Computational Linguistics: NAACL 2024

Domain adaptation from labeled source domains to the target domain is important in practical summarization scenarios. However, the key challenge is domain knowledge disentanglement. In this work, we explore how to disentangle domain-invariant knowledge from source domains while learning specific knowledge of the target domain. Specifically, we propose a hypernetwork-assisted encoder-decoder architecture with parameter-efficient fine-tuning. It leverages a hypernetwork instruction learning module to generate domain-specific parameters from the encoded inputs accompanied by task-related instruction. Further, to better disentangle and transfer knowledge from source domains to the target domain, we introduce a meta-knowledge distillation strategy to build a meta-teacher model that captures domain-invariant knowledge across multiple domains and use it to transfer knowledge to students. Experiments on three dialogue summarization datasets show the effectiveness of the proposed model. Human evaluations also show the superiority of our model with regard to the summary generation quality.

pdf bib
MNER-MI: A Multi-image Dataset for Multimodal Named Entity Recognition in Social Media
Shizhou Huang | Bo Xu | Changqun Li | Jiabo Ye | Xin Lin
Proceedings of the 2024 Joint International Conference on Computational Linguistics, Language Resources and Evaluation (LREC-COLING 2024)

Recently, multimodal named entity recognition (MNER) has emerged as a vital research area within named entity recognition. However, current MNER datasets and methods are predominantly based on text and a single accompanying image, leaving a significant research gap in MNER scenarios involving multiple images. To address the critical research gap and enhance the scope of MNER for real-world applications, we propose a novel human-annotated MNER dataset with multiple images called MNER-MI. Additionally, we construct a dataset named MNER-MI-Plus, derived from MNER-MI, to ensure its generality and applicability. Based on these datasets, we establish a comprehensive set of strong and representative baselines and we further propose a simple temporal prompt model with multiple images to address the new challenges in multi-image scenarios. We have conducted extensive experiments to demonstrate that considering multiple images provides a significant improvement over a single image and can offer substantial benefits for MNER. Furthermore, our proposed method achieves state-of-the-art results on both MNER-MI and MNER-MI-Plus, demonstrating its effectiveness. The datasets and source code can be found at https://github.com/JinFish/MNER-MI.

2022

pdf bib
Curriculum Prompt Learning with Self-Training for Abstractive Dialogue Summarization
Changqun Li | Linlin Wang | Xin Lin | Gerard de Melo | Liang He
Proceedings of the 2022 Conference on Empirical Methods in Natural Language Processing

Succinctly summarizing dialogue is a task of growing interest, but inherent challenges, such as insufficient training data and low information density impede our ability to train abstractive models. In this work, we propose a novel curriculum-based prompt learning method with self-training to address these problems. Specifically, prompts are learned using a curriculum learning strategy that gradually increases the degree of prompt perturbation, thereby improving the dialogue understanding and modeling capabilities of our model. Unlabeled dialogue is incorporated by means of self-training so as to reduce the dependency on labeled data. We further investigate topic-aware prompts to better plan for the generation of summaries. Experiments confirm that our model substantially outperforms strong baselines and achieves new state-of-the-art results on the AMI and ICSI datasets. Human evaluations also show the superiority of our model with regard to the summary generation quality.