Zehan Li


2025

Re-Cent: A Relation-Centric Framework for Joint Zero-Shot Relation Triplet Extraction
Zehan Li | Fu Zhang | Kailun Lyu | Jingwei Cheng | Tianyue Peng
Proceedings of the 31st International Conference on Computational Linguistics

Zero-shot Relation Triplet Extraction (ZSRTE) aims to extract triplets from contexts in which the relation patterns are unseen during training. Due to the inherent challenges of the ZSRTE task, existing extractive ZSRTE methods often decompose it into named entity recognition and relation classification, which overlooks the interdependence of the two tasks and may introduce error propagation. Motivated by the intuition that crucial entity attributes may be implicit in the relation labels, we propose a Relation-Centric joint ZSRTE method named Re-Cent. The approach uses minimal information, namely the unseen relation labels, to extract triplets in a single pass with a unified model. We develop two span-based extractors that identify the subjects and objects corresponding to relation labels, forming span-pairs. Additionally, we introduce a relation-based correction mechanism that further refines the triplets by computing the relevance between span-pairs and relation labels. Experiments demonstrate that Re-Cent achieves state-of-the-art performance with fewer parameters and without relying on synthetic data or manual labor.
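
The following is a minimal, illustrative sketch of the general recipe the abstract describes: a single model that conditions two span extractors (for subjects and objects) on an unseen relation label, then scores each candidate span-pair against that label as a correction step. All module names, the toy embedding/GRU encoder, shapes, and the bilinear scoring function are assumptions for illustration, not the paper's implementation.

```python
# Illustrative sketch (not the paper's code): relation-conditioned span extraction
# followed by a relation-based correction score over candidate span-pairs.
import torch
import torch.nn as nn

class RelationConditionedExtractor(nn.Module):
    def __init__(self, hidden: int = 256, vocab: int = 30522):
        super().__init__()
        self.embed = nn.Embedding(vocab, hidden)                    # stand-in for a PLM encoder
        self.encoder = nn.GRU(hidden, hidden, batch_first=True, bidirectional=True)
        self.subj_head = nn.Linear(2 * hidden, 2)                   # start/end logits for subject spans
        self.obj_head = nn.Linear(2 * hidden, 2)                    # start/end logits for object spans
        self.pair_scorer = nn.Bilinear(4 * hidden, 2 * hidden, 1)   # span-pair vs. relation relevance

    def forward(self, token_ids, relation_ids):
        # Prepend the (unseen) relation label tokens to the sentence so that
        # both span extractors are conditioned on the relation.
        h, _ = self.encoder(self.embed(torch.cat([relation_ids, token_ids], dim=1)))
        rel_len = relation_ids.size(1)
        rel_repr = h[:, :rel_len].mean(dim=1)        # pooled relation representation
        sent_h = h[:, rel_len:]                      # contextual sentence states
        return sent_h, rel_repr, self.subj_head(sent_h), self.obj_head(sent_h)

    def correction_score(self, sent_h, rel_repr, subj_span, obj_span):
        # Relation-based correction: relevance of a (subject, object) span-pair
        # to the relation label; low-scoring pairs would be discarded.
        subj_vec = sent_h[:, subj_span[0]:subj_span[1] + 1].mean(dim=1)
        obj_vec = sent_h[:, obj_span[0]:obj_span[1] + 1].mean(dim=1)
        pair = torch.cat([subj_vec, obj_vec], dim=-1)
        return torch.sigmoid(self.pair_scorer(pair, rel_repr)).squeeze(-1)

# Toy usage: one sentence of 12 tokens, one unseen relation label of 3 tokens.
model = RelationConditionedExtractor()
sent_h, rel_repr, subj_logits, obj_logits = model(torch.randint(0, 30522, (1, 12)),
                                                  torch.randint(0, 30522, (1, 3)))
print(model.correction_score(sent_h, rel_repr, subj_span=(2, 3), obj_span=(7, 9)))
```

In a real system the toy encoder would be replaced by a pre-trained language model, and span-pairs whose correction score falls below a threshold would be dropped from the final triplet set.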

CE-DA: Custom Embedding and Dynamic Aggregation for Zero-Shot Relation Extraction
Fu Zhang | He Liu | Zehan Li | Jingwei Cheng
Proceedings of the 31st International Conference on Computational Linguistics

Zero-shot Relation Extraction (ZSRE) aims to predict novel relations from sentences with given entity pairs, where the relations have not been encountered during training. Prototype-based methods, which achieve ZSRE by aligning the sentence representation with the relation prototype representation, have shown great potential. However, most existing works focus solely on improving the quality of prototype representations, neglecting sentence representations and lacking interaction between different types of relation side information. In this paper, we propose a novel ZSRE framework named CE-DA, which includes two modules: Custom Embedding and Dynamic Aggregation. We employ a two-stage approach to obtain customized sentence embeddings. In the first stage, we train a sentence encoder through unsupervised contrastive learning; in the second stage, we highlight the potential relations between entities in sentences using carefully designed entity-emphasis prompts to further enhance sentence representations. Additionally, our dynamic aggregation method assigns different weights to different types of relation side information through a learnable network to improve the quality of relation prototype representations. In contrast to traditional methods that treat all side information as equally important, our dynamic aggregation method further strengthens the interaction between different types of relation side information. Our method demonstrates competitive performance across various metrics on two ZSRE datasets.
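
As a rough illustration of the dynamic-aggregation idea, the sketch below uses a small learnable scoring network to weight several types of relation side information (e.g., label name, description, aliases) before summing them into a prototype. The module name, shapes, and the softmax weighting are assumptions for illustration, not CE-DA's actual architecture.

```python
# Illustrative sketch (not the paper's code): learnable weighting over several
# kinds of relation side information to build one prototype vector per relation.
import torch
import torch.nn as nn
import torch.nn.functional as F

class DynamicAggregator(nn.Module):
    """Assigns a learned weight to each side-information embedding of a relation."""
    def __init__(self, dim: int = 768):
        super().__init__()
        self.scorer = nn.Sequential(nn.Linear(dim, dim // 2), nn.Tanh(), nn.Linear(dim // 2, 1))

    def forward(self, side_embeddings):
        # side_embeddings: (num_relations, num_side_types, dim)
        weights = torch.softmax(self.scorer(side_embeddings), dim=1)   # (R, S, 1)
        return (weights * side_embeddings).sum(dim=1)                  # prototypes: (R, dim)

# Toy usage: 5 unseen relations, 3 side-information types, 768-d encoder outputs.
aggregator = DynamicAggregator()
prototypes = aggregator(torch.randn(5, 3, 768))
sentence_repr = torch.randn(2, 768)                                    # two test sentences
scores = F.cosine_similarity(sentence_repr.unsqueeze(1), prototypes.unsqueeze(0), dim=-1)
print(scores.argmax(dim=-1))                                           # predicted relation indices
```

Prediction then reduces to nearest-prototype matching between sentence representations and the aggregated prototypes, as in other prototype-based ZSRE methods.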

2024

AlignRE: An Encoding and Semantic Alignment Approach for Zero-Shot Relation Extraction
Zehan Li | Fu Zhang | Jingwei Cheng
Findings of the Association for Computational Linguistics: ACL 2024

Zero-shot Relation Extraction (ZSRE) aims to predict unseen relations between entity pairs from input sentences. Existing prototype-based ZSRE methods encode relation descriptions into prototype embeddings and predict by measuring the similarity between sentence embeddings and prototype embeddings. However, these methods often overlook the abundant side information of relations and suffer from a significant encoding gap between prototypes and sentences, limiting performance. To this end, we propose AlignRE, a framework based on two alignment methods for ZSRE. Specifically, we present a novel perspective centered on encoding-schema alignment to enhance prototype-based ZSRE methods, using well-designed prompt-tuning to bridge the encoding gap. To improve prototype quality, we explore and leverage multiple types of side information and propose a prototype aggregation method based on semantic alignment to create comprehensive relation prototype representations. We conduct experiments on the FewRel and Wiki-ZSL datasets and consistently outperform state-of-the-art methods. Moreover, our method runs substantially faster and reduces the need for extensive manual labor in prototype construction. Code is available at https://github.com/lizehan1999/AlignRE.
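
To make the prototype-aggregation idea concrete, here is a hedged sketch in which each piece of side information is weighted by its semantic similarity to the relation label embedding before being pooled into a prototype, and prediction is nearest-prototype matching by cosine similarity. The weighting scheme, function names, and dimensions are illustrative assumptions rather than AlignRE's exact method.

```python
# Illustrative sketch (not the paper's code): similarity-weighted aggregation of
# relation side information into prototypes, then nearest-prototype prediction.
import torch
import torch.nn.functional as F

def aggregate_by_semantic_alignment(label_emb, side_embs):
    """label_emb: (R, dim); side_embs: (R, S, dim) -> relation prototypes (R, dim).
    Each side-information embedding is weighted by its similarity to the label embedding."""
    sim = F.cosine_similarity(side_embs, label_emb.unsqueeze(1), dim=-1)   # (R, S)
    weights = torch.softmax(sim, dim=1).unsqueeze(-1)                      # (R, S, 1)
    return (weights * side_embs).sum(dim=1)

def predict(sentence_embs, prototypes):
    """Nearest-prototype prediction by cosine similarity."""
    scores = F.cosine_similarity(sentence_embs.unsqueeze(1), prototypes.unsqueeze(0), dim=-1)
    return scores.argmax(dim=-1)

# Toy usage: 4 unseen relations, 3 side-information types, 768-d embeddings.
prototypes = aggregate_by_semantic_alignment(torch.randn(4, 768), torch.randn(4, 3, 768))
print(predict(torch.randn(2, 768), prototypes))
```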

ProCQA: A Large-scale Community-based Programming Question Answering Dataset for Code Search
Zehan Li | Jianfei Zhang | Chuantao Yin | Yuanxin Ouyang | Wenge Rong
Proceedings of the 2024 Joint International Conference on Computational Linguistics, Language Resources and Evaluation (LREC-COLING 2024)

Retrieval-based code question answering seeks to match user queries in natural language to relevant code snippets. Previous approaches typically rely on pre-training models with crafted bi-modal and uni-modal datasets to align text and code representations. In this paper, we introduce ProCQA, a large-scale programming question answering dataset extracted from the StackOverflow community, offering naturally structured mixed-modal QA pairs. To validate its effectiveness, we propose a modality-agnostic contrastive pre-training approach to improve the alignment of text and code representations in current code language models. Compared to previous models that primarily employ bi-modal and uni-modal pairs extracted from CodeSearchNet for pre-training, our model exhibits significant performance improvements across a wide range of code retrieval benchmarks.
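
The modality-agnostic contrastive pre-training the abstract refers to can be pictured with a standard in-batch InfoNCE objective, where a single encoder embeds queries and answers regardless of whether they contain text, code, or a mix of both. The function below is a generic sketch under that assumption, not the paper's training code; the temperature value and tensor shapes are placeholders.

```python
# Illustrative sketch (not the paper's code): in-batch contrastive alignment of
# query and answer embeddings produced by one modality-agnostic encoder.
import torch
import torch.nn.functional as F

def info_nce_loss(query_emb, passage_emb, temperature: float = 0.05):
    """query_emb, passage_emb: (B, dim). Row i of each tensor forms a positive pair;
    every other row in the batch serves as an in-batch negative."""
    q = F.normalize(query_emb, dim=-1)
    p = F.normalize(passage_emb, dim=-1)
    logits = q @ p.t() / temperature               # (B, B) similarity matrix
    targets = torch.arange(q.size(0))              # positives lie on the diagonal
    return F.cross_entropy(logits, targets)

# Toy usage: a mixed-modal batch in which "passages" may be text, code, or both,
# all embedded by the same encoder.
print(info_nce_loss(torch.randn(8, 768), torch.randn(8, 768)))
```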

2023

Text Representation Distillation via Information Bottleneck Principle
Yanzhao Zhang | Dingkun Long | Zehan Li | Pengjun Xie
Proceedings of the 2023 Conference on Empirical Methods in Natural Language Processing

Pre-trained language models (PLMs) have recently shown great success in the field of text representation. However, the high computational cost and high-dimensional representations of PLMs pose significant challenges for practical applications. To make the models more accessible, an effective approach is to distill large models into smaller representation models. To relieve the performance degradation that follows distillation, we propose a novel knowledge distillation method called IBKD. The approach is motivated by the Information Bottleneck principle and aims to maximize the mutual information between the final representations of the teacher and student models, while simultaneously reducing the mutual information between the student model's representation and the input data. This enables the student model to preserve important learned information while discarding unnecessary information, reducing the risk of overfitting. Empirical studies on two main downstream applications of text representation (Semantic Textual Similarity and Dense Retrieval) demonstrate the effectiveness of the proposed approach.
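
In generic information-bottleneck form, the trade-off the abstract describes can be written as the objective below, where S is the student representation, T is the teacher representation, X is the input text, and β is a trade-off coefficient. The notation and the coefficient are illustrative rather than taken from the paper, and in practice the two mutual-information terms are approximated with tractable bounds.

```latex
% Generic IB-style distillation objective (notation is illustrative):
% maximize I(S; T) while penalizing I(S; X).
\mathcal{L}_{\mathrm{IB}} = -\, I(S;\, T) + \beta \, I(S;\, X)
```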