2025
Do LLMs Know and Understand Domain Conceptual Knowledge?
Sijia Shen | Feiyan Jiang | Peiyan Wang | Yubo Feng | Yuchen Jiang | Chang Liu
Findings of the Association for Computational Linguistics: EMNLP 2025
This paper focuses on the task of generating concept sememe trees to study whether Large Language Models (LLMs) can understand and generate domain conceptual knowledge. A concept sememe tree is a hierarchical structure that represents lexical meaning by combining sememes and their relationships. To this end, we introduce the Neighbor Semantic Structure (NSS) and Chain-of-Thought (CoT) prompting method to evaluate the effectiveness of various LLMs in generating accurate and comprehensive sememe trees across different domains. The NSS, guided by conceptual metaphors, identifies terms that exhibit significant external systematicity within a hierarchical relational network and incorporates them as examples in the learning process of LLMs. Meanwhile, the CoT prompting method guides LLMs through a systematic analysis of a term’s intrinsic core concepts, essential attributes, and semantic relationships, enabling the generation of concept sememe trees. We conduct experiments using datasets drawn from four authoritative terminology manuals and evaluate different LLMs. The experimental results indicate that LLMs possess the capability to capture and represent the conceptual knowledge aspects of domain-specific terms. Moreover, the integration of NSS examples with a structured CoT process allows LLMs to explore domain conceptual knowledge more profoundly, leading to the generation of highly accurate concept sememe trees.
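To make the NSS-plus-CoT setup concrete, here is a minimal Python sketch of how neighbor examples and stepwise instructions might be assembled into a single prompt. It is not the authors’ released code: the `NeighborExample` fields, the step wording, and `build_prompt` are illustrative assumptions.

```python
from dataclasses import dataclass

@dataclass
class NeighborExample:
    term: str          # a neighboring term from the hierarchical relational network
    sememe_tree: str   # its concept sememe tree, serialized as bracketed text

COT_STEPS = (
    "1. Identify the term's intrinsic core concept.\n"
    "2. List its essential attributes.\n"
    "3. Link each attribute to the core concept with a semantic relation.\n"
    "4. Assemble the result as a concept sememe tree."
)

def build_prompt(term: str, neighbors: list[NeighborExample]) -> str:
    """Compose a CoT prompt seeded with Neighbor Semantic Structure examples."""
    examples = "\n".join(
        f"Term: {n.term}\nSememe tree: {n.sememe_tree}" for n in neighbors
    )
    return (
        f"Here are structurally similar terms and their sememe trees:\n{examples}\n\n"
        f"Analyze the term '{term}' step by step:\n{COT_STEPS}\n"
        f"Finally, output the concept sememe tree for '{term}'."
    )

if __name__ == "__main__":
    nss = [NeighborExample("scalpel", "{tool: {purpose: cut, domain: surgery}}")]
    print(build_prompt("forceps", nss))
```

The design point the abstract stresses is that the in-context examples are not arbitrary: they are neighbors drawn from the hierarchical relational network, so the LLM sees structurally analogous sememe trees before reasoning about the target term.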
GDLLM: A Global Distance-aware Modeling Approach Based on Large Language Models for Event Temporal Relation Extraction
Jie Zhao | Wanting Ning | Yuxiao Fei | Yubo Feng | Lishuang Li
Findings of the Association for Computational Linguistics: EMNLP 2025
In Natural Language Processing (NLP), Event Temporal Relation Extraction (ETRE) aims to recognize the temporal relations between two events. Prior studies have noted the importance of language models for ETRE. However, the restricted pre-trained knowledge of Small Language Models (SLMs) limits their capability to handle minority class relations in imbalanced classification datasets. For Large Language Models (LLMs), researchers adopt manually designed prompts or instructions, which may introduce extra noise, leading to interference with the model’s judgment of the long-distance dependencies between events. To address these issues, we propose GDLLM, a Global Distance-aware modeling approach based on LLMs. We first present a distance-aware graph structure utilizing a Graph Attention Network (GAT) to assist the LLMs in capturing long-distance dependency features. Additionally, we design a temporal feature learning paradigm based on soft inference to augment the identification of relations within a short-distance proximity band, which supplements the probabilistic information generated by LLMs into the multi-head attention mechanism. Since the global feature can be captured effectively, our framework substantially enhances the performance of minority relation classes and improves the overall learning ability. Experiments on two publicly available datasets, TB-Dense and MATRES, demonstrate that our approach achieves state-of-the-art (SOTA) performance.
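As a rough illustration of the distance-aware graph component, the sketch below implements a single graph-attention layer with a bucketed token-distance bias on its edge scores. The shapes, the distance-bucket embedding, and the layer’s interface are our assumptions, not GDLLM’s actual architecture.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class DistanceAwareGAT(nn.Module):
    """One single-head GAT layer whose edge scores carry a token-distance bias."""

    def __init__(self, dim: int, num_buckets: int = 10):
        super().__init__()
        self.proj = nn.Linear(dim, dim, bias=False)
        self.attn = nn.Linear(2 * dim, 1, bias=False)
        self.dist_emb = nn.Embedding(num_buckets, 1)   # bucketed token distance

    def forward(self, h, adj, dist):
        # h: (N, dim) event-node states; adj: (N, N) 0/1 mask;
        # dist: (N, N) long tensor of distance-bucket ids
        z = self.proj(h)
        pair = torch.cat(
            [z.unsqueeze(1).expand(-1, z.size(0), -1),
             z.unsqueeze(0).expand(z.size(0), -1, -1)],
            dim=-1,
        )                                              # (N, N, 2*dim)
        e = F.leaky_relu(self.attn(pair).squeeze(-1))  # raw edge scores (N, N)
        e = e + self.dist_emb(dist).squeeze(-1)        # add distance-aware bias
        e = e.masked_fill(adj == 0, float("-inf"))     # keep only graph edges
        alpha = torch.softmax(e, dim=-1)               # attention per source node
        return alpha @ z                               # aggregated node states

# usage: 4 event nodes, fully connected, random distance buckets
h = torch.randn(4, 16)
adj = torch.ones(4, 4)
dist = torch.randint(0, 10, (4, 4))
out = DistanceAwareGAT(16)(h, adj, dist)               # (4, 16)
```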
Rule-Guided Extraction: A Hierarchical Rule Optimization Framework for Document-Level Event Argument Extraction
Yue Zuo | Yuxiao Fei | Wanting Ning | Jiayi Huang | Yubo Feng | Lishuang Li
Findings of the Association for Computational Linguistics: EMNLP 2025
Document-level event argument extraction (EAE) is a critical task in natural language processing. While most prior approaches rely on supervised training with large labeled datasets or resource-intensive fine-tuning, recent studies explore in-context learning (ICL) with LLMs to reduce data dependence and training costs. However, the performance of ICL-based methods still lags behind fully supervised models. We highlight a key reason for this shortfall: the lack of sufficient extraction rules. In this paper, we conduct a systematic study of using hierarchical rules to enhance LLMs’ ICL capabilities. We first define three types of hierarchical rules and demonstrate their effectiveness in enhancing the performance of LLMs for document-level EAE. Building on this, we further propose an LLM-driven HiErarchical Rule Optimization (HERO) framework that iteratively generates and selects optimal hierarchical rules. Specifically, in each iteration, high-value instances are selected to produce error feedback, which is used to update and expand hierarchical rule sets. This results in multiple candidate hierarchical rule sets, from which the optimal one is selected using a scoring-based mechanism. During inference, prompts are constructed using the optimal hierarchical rules to enhance the ICL performance of LLMs. Extensive experiments demonstrate the effectiveness of HERO, surpassing few-shot supervised methods and outperforming state-of-the-art prompting baselines by 3.18% F1 on RAMS, 4.30% F1 on DocEE-N, and 3.17% F1 on DocEE-C.
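The iterative generate-and-select loop can be summarized in a few lines; the sketch below is schematic, with the rule generator, error-feedback producer, and scorer passed in as placeholder callables rather than the paper’s actual interfaces.

```python
from typing import Callable

def hero_optimize(
    seed_rules: list[str],
    instances: list[dict],
    select_high_value: Callable[[list[dict], list[str]], list[dict]],
    error_feedback: Callable[[list[dict], list[str]], str],
    update_rules: Callable[[list[str], str], list[str]],
    score: Callable[[list[str]], float],
    iterations: int = 5,
) -> list[str]:
    """Iteratively expand candidate hierarchical rule sets and keep the best."""
    candidates = [seed_rules]
    for _ in range(iterations):
        rules = candidates[-1]
        hard = select_high_value(instances, rules)  # instances the rules mishandle
        feedback = error_feedback(hard, rules)      # LLM-produced error analysis
        candidates.append(update_rules(rules, feedback))
    return max(candidates, key=score)               # scoring-based selection
```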
2024
Biomedical Event Causal Relation Extraction by Reasoning Optimal Entity Relation Path
Lishuang Li | Liteng Mi | Beibei Zhang | Yi Xiang | Yubo Feng | Xueyang Qin | Jingyao Tang
Proceedings of the 23rd Chinese National Conference on Computational Linguistics (Volume 1: Main Conference)
Biomedical Event Causal Relation Extraction (BECRE) is an important task in biomedical information extraction. Existing methods usually use pre-trained language models to learn semantic representations and then predict the event causal relation. However, these methods struggle to capture sufficient cues in biomedical texts for predicting causal relations. In this paper, we propose a Path Reasoning-based Relation-aware Network (PRRN) to explore deeper cues for causal relations using reinforcement learning. Specifically, our model reasons over the relation paths between entity arguments of two events, namely entity relation paths, which connect the two biomedical events through multi-hop interactions between entities to provide richer cues for predicting event causal relations. In PRRN, we design a path reasoning module based on reinforcement learning and propose a novel reward function to encourage the model to focus on the length and contextual relevance of entity relation paths. The experimental results on two datasets suggest that PRRN brings considerable improvements over the state-of-the-art models.
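The reward the abstract describes, which trades off path length against contextual relevance, might take a shape like the following; the exponential length penalty and the mixing weight `alpha` are assumptions for illustration only.

```python
import math

def path_reward(path_len: int, relevance: float,
                max_len: int = 5, alpha: float = 0.5) -> float:
    """Combine a length penalty with a contextual-relevance score in [0, 1]."""
    length_term = math.exp(-(path_len - 1) / max_len)  # shorter paths score higher
    return alpha * length_term + (1 - alpha) * relevance

print(path_reward(path_len=2, relevance=0.8))  # short, relevant path -> high reward
print(path_reward(path_len=5, relevance=0.3))  # long, weakly relevant -> low reward
```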
Triple-view Event Hierarchy Model for Biomedical Event Representation
Jiayi Huang | Lishuang Li | Xueyang Qin | Yi Xiang | Jiaqi Li | Yubo Feng
Proceedings of the 23rd Chinese National Conference on Computational Linguistics (Volume 1: Main Conference)
Biomedical event representation can be applied to various language tasks. A biomedical event often involves multiple biomedical entities and trigger words, and the event structure is complex. However, existing research on event representation mainly focuses on the general domain. If models from the general domain are directly transferred to biomedical event representation, the results may not be satisfactory. We argue that biomedical events can be divided into three hierarchies, each containing unique feature information. Therefore, we propose the Triple-view Event Hierarchy Model (TEHM) to enhance the quality of biomedical event representation. TEHM extracts feature information from three different views and integrates them. Specifically, due to the complexity of biomedical events, we propose the Trigger-aware Aggregator module to handle complex units within biomedical events. Additionally, we annotate two similarity task datasets in the biomedical domain using annotation standards from the general domain. Extensive experiments demonstrate that TEHM achieves state-of-the-art performance on biomedical similarity tasks and biomedical event causal relation extraction.
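A trigger-aware aggregation step of the kind the abstract names could look like the following minimal sketch, where argument and entity states are pooled under attention conditioned on the trigger state; the attention form and tensor shapes are our assumptions.

```python
import torch
import torch.nn as nn

class TriggerAwareAggregator(nn.Module):
    """Pool the states of an event's units under trigger-conditioned attention."""

    def __init__(self, dim: int):
        super().__init__()
        self.query = nn.Linear(dim, dim, bias=False)

    def forward(self, units, trigger):
        # units: (n_units, dim) argument/entity states; trigger: (dim,) trigger state
        scores = units @ self.query(trigger)   # (n_units,) trigger-conditioned scores
        weights = torch.softmax(scores, dim=0)
        return weights @ units                 # (dim,) aggregated event vector

units, trigger = torch.randn(3, 16), torch.randn(16)
event_vec = TriggerAwareAggregator(16)(units, trigger)  # (16,)
```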
Temporal Cognitive Tree: A Hierarchical Modeling Approach for Event Temporal Relation Extraction
Wanting Ning | Lishuang Li | Xueyang Qin | Yubo Feng | Jingyao Tang
Findings of the Association for Computational Linguistics: EMNLP 2024
Understanding and analyzing event temporal relations is a crucial task in Natural Language Processing (NLP). This task, known as Event Temporal Relation Extraction (ETRE), aims to identify and extract temporal connections between events in text. Recent studies focus on locating the relative position of event pairs on the timeline by designing logical expressions or auxiliary tasks to predict their temporal occurrence. Despite these advances, this modeling approach neglects the multidimensional information in temporal relations and the hierarchical process of reasoning. In this study, we propose a novel hierarchical modeling approach for this task by introducing a Temporal Cognitive Tree (TCT) that mimics human logical reasoning. Additionally, we design an integrated model incorporating prompt optimization and deductive reasoning to exploit multidimensional supervised information. Extensive experiments on the TB-Dense and MATRES datasets demonstrate that our approach outperforms existing methods.
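To convey the flavor of hierarchical, tree-style temporal reasoning, here is a toy coarse-to-fine classifier over interval comparisons. The two-stage decision order and the label mapping are our simplifications, not the paper’s actual TCT construction.

```python
def classify_temporal(start_cmp: int, end_cmp: int) -> str:
    """Coarse-to-fine decision over interval end points.

    start_cmp / end_cmp: -1 if event1's start/end is earlier than event2's,
    0 if simultaneous, 1 if later.
    """
    if start_cmp < 0:                          # coarse step: compare start points
        return "BEFORE" if end_cmp <= 0 else "INCLUDES"
    if start_cmp > 0:
        return "AFTER" if end_cmp >= 0 else "IS_INCLUDED"
    # fine step: same start, so the end points decide
    return {-1: "IS_INCLUDED", 0: "SIMULTANEOUS", 1: "INCLUDES"}[end_cmp]

print(classify_temporal(-1, -1))  # BEFORE
print(classify_temporal(-1, 1))   # INCLUDES
```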
Event Representation Learning with Multi-Grained Contrastive Learning and Triple-Mixture of Experts
Tianqi Hu | Lishuang Li | Xueyang Qin | Yubo Feng
Proceedings of the 2024 Joint International Conference on Computational Linguistics, Language Resources and Evaluation (LREC-COLING 2024)
Event representation learning plays a crucial role in numerous natural language processing (NLP) tasks, as it facilitates the extraction of semantic features associated with events. Current methods of learning event representation based on contrastive learning process positive examples with a single-grained random masked language model (MLM) but fall short in learning information inside events from multiple aspects. In this paper, we introduce multi-grained contrastive learning and triple-mixture of experts (MCTM) for event representation learning. Our proposed method extends the random MLM by incorporating a specialized MLM designed to capture different grammatical structures within events, which allows the model to learn token-level knowledge from multiple perspectives. Furthermore, we have observed that masked tokens with different granularities affect the model differently; therefore, we incorporate a mixture of experts (MoE) to learn importance weights associated with different granularities. Our experiments demonstrate that MCTM outperforms other baselines in tasks such as hard similarity and transitive sentence similarity, highlighting the superiority of our method.
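The granularity-weighting idea can be sketched as a small gating network over per-granularity event encodings; the expert count, pooling, and gating form below are assumptions, not the MCTM implementation.

```python
import torch
import torch.nn as nn

class GranularityMoE(nn.Module):
    """Gate over event encodings produced under different mask granularities."""

    def __init__(self, dim: int, num_granularities: int = 3):
        super().__init__()
        self.gate = nn.Linear(dim, num_granularities)

    def forward(self, views):
        # views: (num_granularities, dim), one encoding per mask granularity
        pooled = views.mean(dim=0)                          # summary for the gate
        weights = torch.softmax(self.gate(pooled), dim=-1)  # importance weights
        return weights @ views                              # (dim,) weighted fusion

views = torch.randn(3, 16)
fused = GranularityMoE(16)(views)  # (16,)
```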