Yunzhi Yao


2023

pdf bib
Reasoning with Language Model Prompting: A Survey
Shuofei Qiao | Yixin Ou | Ningyu Zhang | Xiang Chen | Yunzhi Yao | Shumin Deng | Chuanqi Tan | Fei Huang | Huajun Chen
Proceedings of the 61st Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers)

Reasoning, as an essential ability for complex problem-solving, can provide back-end support for various real-world applications, such as medical diagnosis, negotiation, etc. This paper provides a comprehensive survey of cutting-edge research on reasoning with language model prompting. We introduce research works with comparisons and summaries and provide systematic resources to help beginners. We also discuss the potential reasons for emerging such reasoning abilities and highlight future research directions. Resources are available at https://github.com/zjunlp/Prompt4ReasoningPapers (updated periodically).

pdf bib
Editing Large Language Models
Ningyu Zhang | Yunzhi Yao | Shumin Deng
Proceedings of the 13th International Joint Conference on Natural Language Processing and the 3rd Conference of the Asia-Pacific Chapter of the Association for Computational Linguistics: Tutorial Abstract

pdf bib
Knowledge Rumination for Pre-trained Language Models
Yunzhi Yao | Peng Wang | Shengyu Mao | Chuanqi Tan | Fei Huang | Huajun Chen | Ningyu Zhang
Proceedings of the 2023 Conference on Empirical Methods in Natural Language Processing

Previous studies have revealed that vanilla pre-trained language models (PLMs) lack the capacity to handle knowledge-intensive NLP tasks alone; thus, several works have attempted to integrate external knowledge into PLMs. However, despite the promising outcome, we empirically observe that PLMs may have already encoded rich knowledge in their pre-trained parameters but fails to fully utilize them when applying to knowledge-intensive tasks. In this paper, we propose a new paradigm dubbed Knowledge Rumination to help the pre-trained language model utilize that related latent knowledge without retrieving them from the external corpus. By simply adding a prompt like “As far as I know” to the PLMs, we try to review related latent knowledge and inject them back into the model for knowledge consolidation. We apply the proposed knowledge rumination to various language models, including RoBERTa, DeBERTa, and GPT-3. Experimental results on six commonsense reasoning tasks and GLUE benchmarks demonstrate the effectiveness of our proposed approach, which proves that the knowledge stored in PLMs can be better exploited to enhance performance.

pdf bib
Editing Large Language Models: Problems, Methods, and Opportunities
Yunzhi Yao | Peng Wang | Bozhong Tian | Siyuan Cheng | Zhoubo Li | Shumin Deng | Huajun Chen | Ningyu Zhang
Proceedings of the 2023 Conference on Empirical Methods in Natural Language Processing

Despite the ability to train capable LLMs, the methodology for maintaining their relevancy and rectifying errors remains elusive. To this end, the past few years have witnessed a surge in techniques for editing LLMs, the objective of which is to alter the behavior of LLMs efficiently within a specific domain without negatively impacting performance across other inputs. This paper embarks on a deep exploration of the problems, methods, and opportunities related to model editing for LLMs. In particular, we provide an exhaustive overview of the task definition and challenges associated with model editing, along with an in-depth empirical analysis of the most progressive methods currently at our disposal. We also build a new benchmark dataset to facilitate a more robust evaluation and pinpoint enduring issues intrinsic to existing techniques. Our objective is to provide valuable insights into the effectiveness and feasibility of each editing technique, thereby assisting the community in making informed decisions on the selection of the most appropriate method for a specific task or context.

2022

pdf bib
Good Visual Guidance Make A Better Extractor: Hierarchical Visual Prefix for Multimodal Entity and Relation Extraction
Xiang Chen | Ningyu Zhang | Lei Li | Yunzhi Yao | Shumin Deng | Chuanqi Tan | Fei Huang | Luo Si | Huajun Chen
Findings of the Association for Computational Linguistics: NAACL 2022

Multimodal named entity recognition and relation extraction (MNER and MRE) is a fundamental and crucial branch in information extraction. However, existing approaches for MNER and MRE usually suffer from error sensitivity when irrelevant object images incorporated in texts. To deal with these issues, we propose a novel Hierarchical Visual Prefix fusion NeTwork (HVPNeT) for visual-enhanced entity and relation extraction, aiming to achieve more effective and robust performance. Specifically, we regard visual representation as pluggable visual prefix to guide the textual representation for error insensitive forecasting decision. We further propose a dynamic gated aggregation strategy to achieve hierarchical multi-scaled visual features as visual prefix for fusion. Extensive experiments on three benchmark datasets demonstrate the effectiveness of our method, and achieve state-of-the-art performance.

2021

pdf bib
Adapt-and-Distill: Developing Small, Fast and Effective Pretrained Language Models for Domains
Yunzhi Yao | Shaohan Huang | Wenhui Wang | Li Dong | Furu Wei
Findings of the Association for Computational Linguistics: ACL-IJCNLP 2021