Jiachang Liu
2023
Causal Intervention for Abstractive Related Work Generation
Jiachang Liu | Qi Zhang | Chongyang Shi | Usman Naseem | Shoujin Wang | Liang Hu | Ivor Tsang
Findings of the Association for Computational Linguistics: EMNLP 2023
Abstractive related work generation has attracted increasing attention in generating coherent related work that helps readers grasp the current research. However, most existing models ignore the inherent causality during related work generation, leading to spurious correlations which degrade the models’ generation quality and generalizability. In this study, we argue that causal intervention can address such limitations and improve the quality and coherence of generated related work. To this end, we propose a novel Causal Intervention Module for Related Work Generation (CaM) to effectively capture causalities in the generation process. Specifically, we first model the relations among the sentence order, document (reference) correlations, and transitional content in related work generation using a causal graph. Then, to implement causal interventions and mitigate the negative impact of spurious correlations, we use do-calculus to derive ordinary conditional probabilities and identify causal effects through CaM. Finally, we subtly fuse CaM with Transformer to obtain an end-to-end related work generation framework. Extensive experiments on two real-world datasets show that CaM effectively helps the model learn causal relations and thus produce related work of higher quality and coherence.
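The abstract describes the do-calculus step only at a high level. As a rough, hedged illustration of that kind of intervention (a generic backdoor adjustment over a discrete confounder, not the authors' CaM implementation; the variable roles and shapes below are assumptions for the sketch), the interventional distribution replaces the ordinary conditional by marginalizing over the confounder:

```python
# Minimal sketch of backdoor adjustment (do-calculus), NOT the CaM architecture.
# Hypothetical discrete setup for illustration: X = sentence-order feature,
# Z = confounder (e.g. document correlation), Y = transitional-content outcome.
import numpy as np

def backdoor_adjustment(p_y_given_x_z: np.ndarray, p_z: np.ndarray) -> np.ndarray:
    """P(Y | do(X)) = sum_z P(Y | X, Z=z) * P(Z=z).

    p_y_given_x_z: shape (num_x, num_z, num_y), normalized over the last axis
    p_z:           shape (num_z,)
    returns:       shape (num_x, num_y)
    """
    return np.einsum("xzy,z->xy", p_y_given_x_z, p_z)

# Toy example with 2 values for X, 3 for Z, and 4 for Y.
rng = np.random.default_rng(0)
p_y_given_x_z = rng.dirichlet(np.ones(4), size=(2, 3))  # each (x, z) row sums to 1
p_z = rng.dirichlet(np.ones(3))
p_y_do_x = backdoor_adjustment(p_y_given_x_z, p_z)
print(p_y_do_x.sum(axis=-1))  # each row sums to 1, as a valid distribution should
```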
2022
What Makes Good In-Context Examples for GPT-3?
Jiachang Liu | Dinghan Shen | Yizhe Zhang | Bill Dolan | Lawrence Carin | Weizhu Chen
Proceedings of Deep Learning Inside Out (DeeLIO 2022): The 3rd Workshop on Knowledge Extraction and Integration for Deep Learning Architectures
GPT-3 has attracted lots of attention due to its superior performance across a wide range of NLP tasks, especially with its in-context learning abilities. Despite its success, we found that the empirical results of GPT-3 depend heavily on the choice of in-context examples. In this work, we investigate whether there are more effective strategies for judiciously selecting in-context examples (relative to random sampling) that better leverage GPT-3’s in-context learning capabilities. Inspired by the recent success of leveraging a retrieval module to augment neural networks, we propose to retrieve examples that are semantically-similar to a test query sample to formulate its corresponding prompt. Intuitively, the examples selected with such a strategy may serve as more informative inputs to unleash GPT-3’s power of text generation. We evaluate the proposed approach on several natural language understanding and generation benchmarks, where the retrieval-based prompt selection approach consistently outperforms the random selection baseline. Moreover, it is observed that the sentence encoders fine-tuned on task-related datasets yield even more helpful retrieval results. Notably, significant gains are observed on tasks such as table-to-text generation (44.3% on the ToTTo dataset) and open-domain question answering (45.5% on the NQ dataset).
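As a hedged sketch of the retrieval-based prompt selection described above (the encoder checkpoint, the value of k, and the Q/A prompt template are illustrative assumptions, not the paper's exact configuration), one can encode the candidate examples and the test query with a sentence encoder and pick the nearest neighbors by cosine similarity:

```python
# Minimal sketch of kNN in-context example retrieval; checkpoint name, k, and
# the prompt template are assumptions made for this illustration.
import numpy as np
from sentence_transformers import SentenceTransformer

encoder = SentenceTransformer("all-MiniLM-L6-v2")  # any sentence encoder works here

def build_prompt(train_pairs, test_question, k=5):
    """Select the k training examples most similar to the test query and
    concatenate them (nearest last) in front of the test question."""
    questions = [q for q, _ in train_pairs]
    emb = encoder.encode(questions + [test_question], normalize_embeddings=True)
    train_emb, query_emb = emb[:-1], emb[-1]
    scores = train_emb @ query_emb                  # cosine similarity (unit vectors)
    top_k = np.argsort(scores)[-k:]                 # ascending, so nearest ends up last
    demos = [f"Q: {train_pairs[i][0]}\nA: {train_pairs[i][1]}" for i in top_k]
    return "\n\n".join(demos + [f"Q: {test_question}\nA:"])
```

Fine-tuning the encoder on task-related data, as the abstract notes, would amount to swapping the checkpoint passed to SentenceTransformer while keeping the retrieval step unchanged.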
Co-authors
- Qi Zhang 1
- Chongyang Shi 1
- Usman Naseem 1
- Shoujin Wang 1
- Liang Hu 1