Sizhe Zhou


2024

Grasping the Essentials: Tailoring Large Language Models for Zero-Shot Relation Extraction
Sizhe Zhou | Yu Meng | Bowen Jin | Jiawei Han
Proceedings of the 2024 Conference on Empirical Methods in Natural Language Processing

Relation extraction (RE) aims to identify semantic relationships between entities within text. Despite considerable advancements, existing models predominantly require extensive annotated training data, which is both costly and labor-intensive to collect. Moreover, these models often struggle to adapt to new or unseen relations. Few-shot learning, which aims to lessen annotation demands, typically provides incomplete and biased supervision for target relations, leading to degraded and unstable performance. To accurately and explicitly describe relation semantics while minimizing annotation demands, we explore the definition-only zero-shot RE setting, where only relation definitions expressed in natural language are used to train an RE model. We introduce REPaL, comprising three stages: (1) We leverage large language models (LLMs) to generate initial seed instances from relation definitions and an unlabeled corpus. (2) We fine-tune a bidirectional Small Language Model (SLM) with the initial seeds to learn relations for the target domain. (3) We expand pattern coverage and mitigate bias from the initial seeds by integrating feedback from the SLM’s predictions on the unlabeled corpus and the synthesis history. To accomplish this, we leverage the multi-turn conversation ability of LLMs to generate new instances in follow-up dialogues, informed by both the feedback and the synthesis history. Studies reveal that definition-oriented seed synthesis enhances pattern coverage, whereas indiscriminately increasing seed quantity leads to performance saturation. Experiments on two datasets show that REPaL improves cost-effective zero-shot performance by large margins.
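
The three stages described in this abstract can be pictured as a small loop. The following is only a minimal sketch under stated assumptions, not the authors' implementation: `toy_synthesize` stands in for the LLM calls, `toy_train` stands in for fine-tuning the SLM, and all names, templates, and the `CUES` list are hypothetical and exist only to make the control flow runnable.

```python
from typing import Callable, List, Tuple

Instance = Tuple[str, bool]  # (sentence, expresses the target relation?)

# Hypothetical surface cues and templates, used only by the toy stand-ins below.
CUES = ("born in", "native of")
TEMPLATES = ["Ada was born in London.", "Ada is a native of London."]


def toy_synthesize(definition: str, corpus: List[str],
                   feedback: List[str], history: List[Instance]) -> List[Instance]:
    """Stand-in for an LLM call. A real system would condition on the
    definition and feedback; this toy only avoids repeating the history."""
    seen = {sentence for sentence, _ in history}
    for sentence in TEMPLATES:
        if sentence not in seen:
            return [(sentence, True)]
    return []


def toy_train(instances: List[Instance]) -> Callable[[str], bool]:
    """Stand-in for fine-tuning an SLM: remember cues seen in positive seeds."""
    learned = {cue for sentence, label in instances if label
               for cue in CUES if cue in sentence}
    return lambda sentence: any(cue in sentence for cue in learned)


def repal_loop(definition, corpus, synthesize, train, rounds=1):
    # Stage 1: initial seeds from the definition and the unlabeled corpus.
    history = synthesize(definition, corpus, [], [])
    # Stage 2: train the small model on the initial seeds.
    classifier = train(history)
    # Stage 3: feed predictions and synthesis history back to the generator.
    for _ in range(rounds):
        feedback = [s for s in corpus if classifier(s)]
        history = history + synthesize(definition, corpus, feedback, history)
        classifier = train(history)
    return classifier, history


corpus = ["Bob is a native of Paris.", "Bob visited Paris."]
classifier, history = repal_loop("PERSON was born in LOCATION",
                                 corpus, toy_synthesize, toy_train, rounds=1)
```

In this toy run the follow-up round adds a second surface pattern, so the retrained classifier also accepts "native of" sentences that the initial seed alone would miss.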

Topic-Oriented Open Relation Extraction with A Priori Seed Generation
Linyi Ding | Jinfeng Xiao | Sizhe Zhou | Chaoqi Yang | Jiawei Han
Proceedings of the 2024 Conference on Empirical Methods in Natural Language Processing

The field of open relation extraction (ORE) has recently seen significant advances thanks to the growing capability of large language models (LLMs). Nevertheless, challenges persist when ORE is performed on specific topics. Existing methods give sub-optimal results in five dimensions: factualness, topic relevance, informativeness, coverage, and uniformity. To improve topic-oriented ORE, we propose a zero-shot approach called PriORE: Open Relation Extraction with a Priori seed generation. PriORE leverages the built-in knowledge of LLMs to maintain a dynamic seed relation dictionary for the topic. The dictionary is initialized with seed relations generated from topic-relevant entity types and is expanded during contextualized ORE. PriORE then reduces the randomness in generative ORE by converting it to a more robust relation classification task. Experiments show that the approach enables better topic-oriented control over the generated relations and thus improves ORE performance along the five dimensions, especially on specialized and narrow topics.
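
The idea of a dynamic seed relation dictionary that turns free-form generation into classification can be illustrated with a small sketch. This is an assumption-laden toy, not PriORE itself: string similarity from Python's `difflib` stands in for the LLM-based classification, and the class name, seed relations, and cutoff value are all hypothetical.

```python
import difflib


class SeedRelationDictionary:
    """Toy dictionary of canonical relation names for one topic."""

    def __init__(self, seeds, cutoff=0.6):
        self.relations = list(seeds)   # a priori seed relations
        self.cutoff = cutoff           # hypothetical similarity threshold

    def classify(self, phrase):
        """Map a free-form relation phrase to an existing entry when one is
        close enough; otherwise expand the dictionary with the new phrase."""
        key = phrase.lower().strip().replace(" ", "_")
        matches = difflib.get_close_matches(key, self.relations,
                                            n=1, cutoff=self.cutoff)
        if matches:
            return matches[0]
        self.relations.append(key)     # dynamic expansion
        return key


seed_dict = SeedRelationDictionary(["founded_by", "located_in"])
known = seed_dict.classify("was founded by")   # maps onto an existing seed
novel = seed_dict.classify("acquired")         # no close seed, so it is added
```

Mapping varied phrasings onto a fixed label set is what makes the output more uniform; the expansion branch keeps coverage from being capped by the initial seeds.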

Text2DB: Integration-Aware Information Extraction with Large Language Model Agents
Yizhu Jiao | Sha Li | Sizhe Zhou | Heng Ji | Jiawei Han
Findings of the Association for Computational Linguistics: ACL 2024

The task of information extraction (IE) is to extract structured knowledge from text. However, it is often not straightforward to utilize IE output due to the mismatch between the IE ontology and the needs of downstream applications. We propose a new formulation of IE, Text2DB, that emphasizes the integration of IE output and the target database (or knowledge base). Given a user instruction, a document set, and a database, our task requires the model to update the database with values from the document set to satisfy the user instruction. This task requires understanding user instructions to decide what to extract and adapting on the fly to the given DB/KB schema to decide how to extract. To evaluate this new task, we introduce a new benchmark featuring common demands such as data infilling, row population, and column addition. In addition, we propose an LLM agent framework, OPAL (Observe-Plan-Analyze LLM), which includes an Observer component that interacts with the database, a Planner component that generates a code-based plan with calls to IE models, and an Analyzer component that provides feedback regarding code quality before execution. Experiments show that OPAL can successfully adapt to diverse database schemas by generating different code plans and calling the required IE models. We also highlight difficult cases, such as dealing with large databases with complex dependencies and extraction hallucination, which we believe deserve further investigation.
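
The Observer, Planner, and Analyzer roles described above can be sketched for the row-population case. This is a heavily simplified illustration under assumptions, not the OPAL framework: the plan is a plain dictionary rather than generated code, `toy_extract` is a hypothetical stand-in for an IE model, and the table and document are made up. Only the standard-library `sqlite3` module is used.

```python
import sqlite3


def observe(conn, table):
    """Observer: read the column names of the target table."""
    return [row[1] for row in conn.execute(f"PRAGMA table_info({table})")]


def toy_extract(document, field):
    """Hypothetical stand-in for an IE model: parse 'Field: value' lines."""
    for line in document.splitlines():
        key, _, value = line.partition(":")
        if key.strip().lower() == field:
            return value.strip()
    return None


def plan(columns, document):
    """Planner: one extraction step per column in the observed schema."""
    return {column: toy_extract(document, column) for column in columns}


def analyze(steps, columns):
    """Analyzer: report steps that target unknown columns or found no value."""
    return [c for c, v in steps.items() if c not in columns or v is None]


def populate_row(conn, table, document):
    columns = observe(conn, table)
    steps = plan(columns, document)
    problems = analyze(steps, columns)
    if problems:
        raise ValueError(f"plan rejected before execution: {problems}")
    # Table and column names are interpolated for this demo only;
    # the extracted values go through bound parameters.
    placeholders = ", ".join("?" for _ in steps)
    conn.execute(
        f"INSERT INTO {table} ({', '.join(steps)}) VALUES ({placeholders})",
        list(steps.values()),
    )


conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE people (name TEXT, city TEXT)")
populate_row(conn, "people", "Name: Ada\nCity: London")
rows = conn.execute("SELECT name, city FROM people").fetchall()
```

Because the plan is derived from the observed schema rather than a fixed ontology, the same functions would adapt to a table with different columns, and the Analyzer check stops an incomplete plan before anything is written.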

2023

Towards End-to-End Open Conversational Machine Reading
Sizhe Zhou | Siru Ouyang | Zhuosheng Zhang | Hai Zhao
Findings of the Association for Computational Linguistics: EACL 2023

In the open-retrieval conversational machine reading (OR-CMR) task, machines are required to perform multi-turn question answering given a dialogue history and a textual knowledge base. Existing works generally use two independent modules to approach this problem’s two successive sub-tasks: first hard-label decision making, and then question generation aided by various entailment reasoning methods. Such cascaded modeling is vulnerable to error propagation and prevents the two sub-tasks from being consistently optimized. In this work, we instead model OR-CMR as a unified text-to-text task in a fully end-to-end style. Experiments on the ShARC and OR-ShARC datasets show that our proposed end-to-end framework is effective on both sub-tasks, achieving new state-of-the-art results by a large margin. Further ablation studies support that our framework can generalize to different backbone models.