Tianyue Peng


2025

Re-Cent: A Relation-Centric Framework for Joint Zero-Shot Relation Triplet Extraction
Zehan Li | Fu Zhang | Kailun Lyu | Jingwei Cheng | Tianyue Peng
Proceedings of the 31st International Conference on Computational Linguistics

Zero-shot Relation Triplet Extraction (ZSRTE) aims to extract triplets from context for relation patterns that are unseen during training. Due to the inherent challenges of the ZSRTE task, existing extractive ZSRTE methods often decompose it into named entity recognition and relation classification, which overlooks the interdependence of the two tasks and may introduce error propagation. Motivated by the intuition that crucial entity attributes may be implicit in the relation labels, we propose a Relation-Centric joint ZSRTE method named Re-Cent. Our approach uses minimal information, namely the unseen relation labels, to extract triplets in a single pass with a unified model. We develop two span-based extractors to identify the subjects and objects corresponding to relation labels, forming span-pairs. Additionally, we introduce a relation-based correction mechanism that further refines the triplets by calculating the relevance between span-pairs and relation labels. Experiments demonstrate that Re-Cent achieves state-of-the-art performance with fewer parameters and does not rely on synthetic data or manual labor.
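To make the described design concrete, here is a minimal PyTorch sketch of a relation-centric extractor in the spirit of the abstract: a shared encoder reads the relation label together with the sentence, two span-based heads propose subject and object spans, and a correction head scores span-pair/relation relevance. All module names, dimensions, and pooling choices are illustrative assumptions, not the authors' released code.

```python
# Hypothetical sketch of a Re-Cent-style model (names are illustrative).
import torch
import torch.nn as nn

class ReCentSketch(nn.Module):
    def __init__(self, vocab_size: int, hidden: int = 128):
        super().__init__()
        self.embed = nn.Embedding(vocab_size, hidden)
        layer = nn.TransformerEncoderLayer(hidden, nhead=4, batch_first=True)
        self.encoder = nn.TransformerEncoder(layer, num_layers=2)
        # Two span-based extractors: start/end logits for subjects and objects.
        self.subj_head = nn.Linear(hidden, 2)
        self.obj_head = nn.Linear(hidden, 2)
        # Relation-based correction: relevance of a span-pair to the relation.
        self.correct = nn.Bilinear(2 * hidden, hidden, 1)

    def forward(self, token_ids, rel_ids):
        # token_ids: (B, T) relation-label tokens + separator + sentence tokens
        h = self.encoder(self.embed(token_ids))      # (B, T, H)
        subj_logits = self.subj_head(h)              # (B, T, 2) start/end
        obj_logits = self.obj_head(h)                # (B, T, 2) start/end
        rel_vec = self.embed(rel_ids).mean(dim=1)    # (B, H) pooled label
        return h, subj_logits, obj_logits, rel_vec

    def pair_score(self, h, subj_span, obj_span, rel_vec):
        # Mean-pool each span, concatenate, and score against the relation;
        # low-scoring span-pairs would be filtered out as corrections.
        s = h[:, subj_span[0]:subj_span[1] + 1].mean(dim=1)
        o = h[:, obj_span[0]:obj_span[1] + 1].mean(dim=1)
        return self.correct(torch.cat([s, o], dim=-1), rel_vec)  # (B, 1)
```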

Frame First, Then Extract: A Frame-Semantic Reasoning Pipeline for Zero-Shot Relation Triplet Extraction
Zehan Li | Fu Zhang | Wenqing Zhang | Jiawei Li | Zhou Li | Jingwei Cheng | Tianyue Peng
Proceedings of the 2025 Conference on Empirical Methods in Natural Language Processing

Large Language Models (LLMs) have shown impressive capabilities in language understanding and generation, leading to growing interest in zero-shot relation triplet extraction (ZeroRTE), a task that aims to extract triplets for unseen relations without annotated data. However, existing methods typically depend on costly fine-tuning and lack the structured semantic guidance required for accurate and interpretable extraction. To overcome these limitations, we propose FrameRTE, a novel ZeroRTE framework that adopts a “frame first, then extract” paradigm. Rather than extracting triplets directly, FrameRTE first constructs high-quality Relation Semantic Frames (RSFs) through a unified pipeline that integrates frame retrieval, synthesis, and enhancement. These RSFs serve as structured and interpretable knowledge scaffolds that guide frozen LLMs during extraction. Building on these RSFs, we further introduce a human-inspired three-stage reasoning pipeline, consisting of semantic frame evocation, frame-guided triplet extraction, and core frame element validation, to achieve semantically constrained extraction. Experiments demonstrate that FrameRTE achieves competitive zero-shot performance on multiple benchmarks. Moreover, the RSFs we construct serve as high-quality semantic resources that can enhance other extraction methods, showcasing the synergy between linguistic knowledge and foundation models.
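The three-stage pipeline lends itself to a simple skeleton. Below is a hedged Python sketch of a “frame first, then extract” flow over a frozen LLM: an RSF packages the structured knowledge, and three functions implement evocation, frame-guided extraction, and core-element validation. The prompts, reply formats, and the `llm` stand-in are assumptions for illustration, not the paper's implementation.

```python
# Illustrative skeleton of a FrameRTE-style pipeline (all names hypothetical).
from dataclasses import dataclass

@dataclass
class RelationSemanticFrame:
    relation: str             # e.g. "founded_by"
    definition: str           # natural-language gloss of the relation
    core_elements: list[str]  # roles the subject/object must fill

def llm(prompt: str) -> str:
    """Stand-in for any frozen LLM call (API or local); no fine-tuning."""
    raise NotImplementedError

def evoke_frames(sentence: str, frames: list[RelationSemanticFrame]):
    # Stage 1: semantic frame evocation — which RSFs does the sentence evoke?
    names = ", ".join(f.relation for f in frames)
    picked = llm(f"Which of these relations does the sentence evoke?\n"
                 f"Relations: {names}\nSentence: {sentence}")
    return [f for f in frames if f.relation in picked]

def extract_with_frame(sentence: str, frame: RelationSemanticFrame):
    # Stage 2: frame-guided triplet extraction, constrained by the RSF.
    out = llm(f"Relation: {frame.relation}\nDefinition: {frame.definition}\n"
              f"Extract subject|object for this relation from: {sentence}")
    subj, obj = out.split("|")  # assumes a simple "subj|obj" reply format
    return subj.strip(), frame.relation, obj.strip()

def validate_core_elements(triplet, frame: RelationSemanticFrame) -> bool:
    # Stage 3: check the arguments actually fill the core frame elements.
    verdict = llm(f"Do '{triplet[0]}' and '{triplet[2]}' fill the roles "
                  f"{frame.core_elements} of '{frame.relation}'? yes/no")
    return verdict.strip().lower().startswith("yes")
```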

Generation-Augmented Retrieval: Rethinking the Role of Large Language Models in Zero-Shot Relation Extraction
Zehan Li | Fu Zhang | Tianyue Peng | He Liu | Jingwei Cheng
Findings of the Association for Computational Linguistics: EMNLP 2025

Recent advances in Relation Extraction (RE) emphasize zero-shot methodologies, which aim to recognize unseen relations between entities without annotated data. Although Large Language Models (LLMs) have demonstrated outstanding performance on many NLP tasks, their performance in Zero-Shot RE (ZSRE) without entity type constraints still lags behind Small Language Models (SLMs). LLM-based ZSRE often involves manual intervention and significant computational overhead, especially when scaling to large-scale multi-choice data. To this end, we introduce RE-GAR-AD, which not only leverages the generative capability of LLMs but also utilizes their representational power, all without tuning the LLMs. We redefine LLM-based ZSRE as a retrieval challenge, utilizing a Generation-Augmented Retrieval framework coupled with a retrieval Adjuster. Specifically, our approach guides LLMs through crafted prompts to distill sentence semantics and enrich relation labels. We encode sentences and relation labels with LLMs and match their embeddings in a triplet fashion. This retrieval formulation significantly reduces token input requirements. Additionally, to further optimize the embeddings, we propose a plug-in retrieval adjuster with only 2M parameters, which allows rapid fine-tuning without accessing the LLMs’ parameters. Our LLM-based model achieves performance comparable to strong baselines on multiple benchmarks.
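A small sketch helps picture the retrieval side. Below, frozen LLM embeddings of (generation-augmented) sentences and enriched relation labels are refined by a plug-in adjuster trained with a triplet margin objective, and zero-shot inference becomes nearest-relation retrieval. The residual-MLP architecture, dimension, and margin are assumptions chosen only so that the parameter count lands near the 2M mentioned in the abstract.

```python
# Hedged sketch of a RE-GAR-AD-style retrieval adjuster (details assumed).
import torch
import torch.nn as nn
import torch.nn.functional as F

class RetrievalAdjuster(nn.Module):
    """Residual MLP over frozen LLM embeddings (~2.1M params at dim=1024)."""
    def __init__(self, dim: int = 1024):
        super().__init__()
        self.mlp = nn.Sequential(
            nn.Linear(dim, dim), nn.GELU(), nn.Linear(dim, dim))

    def forward(self, x):
        return F.normalize(x + self.mlp(x), dim=-1)  # residual + unit norm

adjuster = RetrievalAdjuster()
loss_fn = nn.TripletMarginLoss(margin=0.2)

def train_step(sent_emb, pos_rel_emb, neg_rel_emb, optimizer):
    # sent_emb: frozen LLM embedding of the augmented sentence;
    # pos/neg_rel_emb: embeddings of the true and a sampled wrong relation.
    # Only the adjuster's parameters receive gradients; the LLM is untouched.
    loss = loss_fn(adjuster(sent_emb),
                   adjuster(pos_rel_emb), adjuster(neg_rel_emb))
    optimizer.zero_grad(); loss.backward(); optimizer.step()
    return loss.item()

def retrieve(sent_emb, rel_embs):
    # Zero-shot inference: pick the relation whose adjusted embedding is
    # closest (cosine similarity) to the adjusted sentence embedding.
    scores = adjuster(sent_emb) @ adjuster(rel_embs).T
    return scores.argmax(dim=-1)
```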