Yujie Wang

Also published as: YuJie Wang

Papers on this page may belong to the following people: Yujie Wang (誉杰王), Yujie Wang

2025

Event Causality Identification (ECI) aims to identify fine-grained causal relationships between events in unstructured text. Existing ECI methods primarily rely on knowledge-enhanced and graph-based reasoning approaches, but they often overlook the dependencies between similar events. Additionally, the connection between unstructured text and structured knowledge is relatively weak. Therefore, this paper proposes an ECI method enhanced by LLM Knowledge and Concept-Level Event Relations (LKCER). Specifically, LKCER constructs a concept-level heterogeneous event graph by leveraging the local contextual information of related event mentions, generating a more comprehensive global semantic representation of event concepts. At the same time, the knowledge generated by COMET is filtered and enriched using an LLM, strengthening the associations between event pairs and knowledge. Finally, the event concept representation and the knowledge-enhanced event representation are combined to uncover potential causal relationships between events. Experimental results show that our method outperforms previous state-of-the-art methods on both benchmarks, EventStoryLine and Causal-TimeBank.

Event Causality Identification (ECI) aims to identify fine-grained causal relationships between events in unstructured text. Contrastive learning has shown promise in enhancing ECI by optimizing representation distances between positive and negative samples. However, existing methods often rely on rule-based or random sampling strategies, which may introduce spurious causal positives. Moreover, static negative samples often fail to approximate actual decision boundaries, thus limiting discriminative performance. Therefore, we propose an ECI method enhanced by Dynamic Energy-based Contrastive Learning with multi-stage knowledge Verification (DECLV). Specifically, we integrate multi-source knowledge validation and LLM-driven causal inference to construct a multi-stage knowledge validation mechanism, which generates high-quality contrastive samples and effectively suppresses spurious causal disturbances. Meanwhile, we introduce the Stochastic Gradient Langevin Dynamics (SGLD) method to dynamically generate adversarial negative samples, and employ an energy-based function to model the causal boundary between positive and negative samples. Experimental results show that our method outperforms previous state-of-the-art methods on both benchmarks, EventStoryLine and Causal-TimeBank.
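The SGLD step mentioned in the abstract above can be illustrated with a minimal sketch. This is not the DECLV implementation: the quadratic energy function, the `center` vector, and all parameter names (`step_size`, `noise_scale`) are assumptions chosen only so the example runs with the Python standard library. It shows the generic Langevin update, in which a sample drifts toward low energy while Gaussian noise is injected at each step.

```python
import math
import random


def energy(x, center):
    """Toy quadratic energy (an assumption): low near `center`, high far away."""
    return 0.5 * sum((xi - ci) ** 2 for xi, ci in zip(x, center))


def energy_grad(x, center):
    """Analytic gradient of the toy energy with respect to x."""
    return [xi - ci for xi, ci in zip(x, center)]


def sgld_sample(x0, center, steps=200, step_size=0.05, noise_scale=1.0, seed=0):
    """Run Langevin updates: x <- x - (step/2) * grad E(x) + sqrt(step) * noise."""
    rng = random.Random(seed)
    x = list(x0)
    for _ in range(steps):
        grad = energy_grad(x, center)
        x = [
            xi
            - 0.5 * step_size * gi
            + noise_scale * math.sqrt(step_size) * rng.gauss(0.0, 1.0)
            for xi, gi in zip(x, grad)
        ]
    return x


if __name__ == "__main__":
    start = [10.0, 10.0]
    sample = sgld_sample(start, center=[0.0, 0.0])
    print("start energy:", energy(start, [0.0, 0.0]))
    print("sample energy:", energy(sample, [0.0, 0.0]))
```

In an energy-based contrastive setup, samples produced this way sit in low-energy regions of the current model, which is one common reason they are treated as hard negatives. How DECLV defines its energy function and uses the samples is not specified in the abstract.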

2024

A structured entailment tree can exhibit the reasoning chains from knowledge facts to predicted answers, which is important for constructing an explainable question answering system. Existing works mainly either generate the entire tree directly or generate the proof steps stepwise. The stepwise methods can exploit combinatoriality and generalize to longer steps, but they suffer from large fact search spaces and error accumulation, which lead to the generation of invalid steps. In this paper, inspired by the Dual Process Theory in cognitive science, we propose FRVA, a Fact-Retrieval and Verification Augmented bidirectional entailment tree generation method that contains two systems. Specifically, System 1 makes intuitive judgments through the fact retrieval module and filters irrelevant facts to reduce the search space. System 2 designs a deductive-abductive bidirectional reasoning module, and we construct cross-verification and multi-view contrastive learning to make the generated proof steps closer to the target hypothesis. We enhance the reliability of the stepwise proofs to mitigate error propagation. Experimental results on EntailmentBank show that FRVA outperforms previous models and achieves state-of-the-art performance in fact selection and structural correctness.

Hyperspherical Multi-Prototype with Optimal Transport for Event Argument Extraction
Guangjun Zhang | Hu Zhang | YuJie Wang | Ru Li | Hongye Tan | Jiye Liang
Proceedings of the 62nd Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers)

Event Argument Extraction (EAE) aims to extract arguments for specified events from a text. Previous research has mainly focused on addressing long-distance dependencies of arguments and modeling co-occurrence relationships between roles and events, but has overlooked potential inductive biases: (i) semantic differences among arguments of the same type and (ii) large margin separation between arguments of different types. Inspired by prototype networks, we introduce a new model named HMPEAE, which takes the two inductive biases above as targets to locate prototypes and guide the model to learn argument representations based on these prototypes. Specifically, we set multiple prototypes to represent each role to capture intra-class differences. Simultaneously, we use a hypersphere as the output space for prototypes, defining large margin separation between prototypes to encourage the model to learn significant differences between different types of arguments effectively. We solve the “argument-prototype” assignment as an optimal transport problem to optimize the argument representation and minimize the absolute distance between arguments and prototypes to achieve compactness within sub-clusters. Experimental results on the RAMS and WikiEvents datasets show that HMPEAE achieves state-of-the-art performance.
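The abstract above frames the "argument-prototype" assignment as an optimal transport problem. A minimal sketch of one standard way to solve such a problem is entropy-regularized optimal transport with Sinkhorn iterations. The abstract does not say which solver HMPEAE uses, so the Sinkhorn choice, the uniform marginals, the `reg` value, and the toy cost matrix are all assumptions made for illustration.

```python
import math


def sinkhorn(cost, reg=0.1, iters=200):
    """Entropy-regularized OT between n arguments (rows) and m prototypes (columns).

    Assumes uniform mass on both sides. Very large cost/reg ratios can
    underflow exp() to zero, so costs should be kept on a modest scale.
    """
    n, m = len(cost), len(cost[0])
    kernel = [[math.exp(-c / reg) for c in row] for row in cost]
    row_mass = [1.0 / n] * n  # assumed uniform marginal over arguments
    col_mass = [1.0 / m] * m  # assumed uniform marginal over prototypes
    u = [1.0] * n
    v = [1.0] * m
    for _ in range(iters):
        # Alternately rescale rows and columns to match the marginals.
        u = [row_mass[i] / sum(kernel[i][j] * v[j] for j in range(m)) for i in range(n)]
        v = [col_mass[j] / sum(kernel[i][j] * u[i] for i in range(n)) for j in range(m)]
    # Transport plan: plan[i][j] is the mass sent from argument i to prototype j.
    return [[u[i] * kernel[i][j] * v[j] for j in range(m)] for i in range(n)]


if __name__ == "__main__":
    # Toy cost matrix (hypothetical): argument 0 is close to prototype 0, and so on.
    toy_cost = [[0.0, 1.0], [1.0, 0.0]]
    for row in sinkhorn(toy_cost):
        print([round(p, 4) for p in row])
```

Each row of the returned plan can be read as a soft assignment of one argument to the prototypes. Taking the largest entry per row gives a hard assignment.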

2023

Dynamic Heterogeneous-Graph Reasoning with Language Models and Knowledge Representation Learning for Commonsense Question Answering
Yujie Wang | Hu Zhang | Jiye Liang | Ru Li
Proceedings of the 61st Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers)

Recently, knowledge graphs (KGs) have achieved noteworthy success in commonsense question answering. Existing methods retrieve relevant subgraphs in the KGs through key entities and reason about the answer with language models (LMs) and graph neural networks. However, they ignore (i) optimizing the knowledge representation and structure of subgraphs and (ii) deeply fusing heterogeneous QA context with subgraphs. In this paper, we propose a dynamic heterogeneous-graph reasoning method with LMs and knowledge representation learning (DHLK), which constructs a heterogeneous knowledge graph (HKG) based on multiple knowledge sources and optimizes the structure and knowledge representation of the HKG using a two-stage pruning strategy and knowledge representation learning (KRL). It then performs joint reasoning by LMs and Relation Mask Self-Attention (RMSA). Specifically, DHLK filters key entities based on the dictionary vocabulary to achieve the first-stage pruning while incorporating the paraphrases in the dictionary into the subgraph to construct the HKG. Then, DHLK encodes and fuses the QA context and HKG using the LM, and dynamically removes irrelevant KG entities based on the attention weights of the LM for the second-stage pruning. Finally, DHLK introduces KRL to optimize the knowledge representation and performs answer reasoning on the HKG by RMSA. We evaluate DHLK on CommonsenseQA and OpenBookQA, and show its improvement over existing LM and LM+KG methods.

DocSplit: Simple Contrastive Pretraining for Large Document Embeddings
Yujie Wang | Mike Izbicki
Findings of the Association for Computational Linguistics: EMNLP 2023

Existing model pretraining methods only consider local information. For example, in the popular token masking strategy, the words closer to the masked token are more important for prediction than words far away. This results in pretrained models that generate high-quality sentence embeddings, but low-quality embeddings for large documents. We propose a new pretraining method called DocSplit which forces models to consider the entire global context of a large document. Our method uses a contrastive loss where the positive examples are randomly sampled sections of the input document, and negative examples are randomly sampled sections of unrelated documents. Like previous pretraining methods, DocSplit is fully unsupervised, easy to implement, and can be used to pretrain any model architecture. Our experiments show that DocSplit outperforms other pretraining methods for document classification, few-shot learning, and information retrieval tasks.
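The sampling scheme described in the abstract above (positives drawn from the same document, negatives drawn from unrelated documents) can be sketched as follows. This is an illustration under stated assumptions, not the DocSplit code: the bag-of-words `embed` function stands in for a real encoder, the InfoNCE form of the contrastive loss is assumed, and `section_len`, `temperature`, and all function names are hypothetical.

```python
import math
import random
from collections import Counter


def sample_section(tokens, length, rng):
    """Return a random contiguous section of at most `length` tokens."""
    if len(tokens) <= length:
        return list(tokens)
    start = rng.randrange(len(tokens) - length + 1)
    return tokens[start:start + length]


def embed(tokens):
    """Toy stand-in encoder (an assumption): L2-normalized bag-of-words counts."""
    counts = Counter(tokens)
    norm = math.sqrt(sum(c * c for c in counts.values()))
    if norm == 0.0:
        return {}
    return {word: c / norm for word, c in counts.items()}


def cosine(a, b):
    """Dot product of two already-normalized sparse vectors."""
    return sum(value * b.get(word, 0.0) for word, value in a.items())


def info_nce(anchor, positive, negatives, temperature=0.1):
    """Negative log-softmax of the positive similarity among all candidates."""
    logits = [cosine(anchor, positive) / temperature]
    logits += [cosine(anchor, neg) / temperature for neg in negatives]
    peak = max(logits)
    log_z = peak + math.log(sum(math.exp(l - peak) for l in logits))
    return log_z - logits[0]


def split_contrastive_loss(docs, doc_index, section_len=8, seed=0):
    """Anchor and positive come from one document; negatives from the others."""
    rng = random.Random(seed)
    doc = docs[doc_index]
    anchor = embed(sample_section(doc, section_len, rng))
    positive = embed(sample_section(doc, section_len, rng))
    negatives = [
        embed(sample_section(other, section_len, rng))
        for i, other in enumerate(docs)
        if i != doc_index
    ]
    return info_nce(anchor, positive, negatives)


if __name__ == "__main__":
    corpus = [["cat"] * 20, ["dog"] * 20]
    print(split_contrastive_loss(corpus, doc_index=0))
```

With a trainable encoder in place of `embed`, minimizing this loss would pull sections of the same document together and push sections of different documents apart, which is the behavior the abstract describes.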

Co-authors

Guangjun Zhang 3

Ya Su 2

Mike Izbicki 1

Yuanlong Wang 1
