Fei Cai


2022

pdf bib
DRK: Discriminative Rule-based Knowledge for Relieving Prediction Confusions in Few-shot Relation Extraction
Mengru Wang | Jianming Zheng | Fei Cai | Taihua Shao | Honghui Chen
Proceedings of the 29th International Conference on Computational Linguistics

Few-shot relation extraction aims to identify the relation type between entities in a given text in the low-resource scenario. Albeit much progress, existing meta-learning methods still fall into prediction confusions owing to the limited inference ability over shallow text features. To relieve these confusions, this paper proposes a discriminative rule-based knowledge (DRK) method. Specifically, DRK adopts a logic-aware inference module to ease the word-overlap confusion, which introduces a logic rule to constrain the inference process, thereby avoiding the adverse effect of shallow text features. Also, DRK employs a discrimination finding module to alleviate the entity-type confusion, which explores distinguishable text features via a hierarchical contrastive learning. We conduct extensive experiments on four types of meta tasks and the results show promising improvements from DRK (6.0% accuracy gains on average). Besides, error analyses reveal the word-overlap and entity-type errors are the main courses of mispredictions in few-shot relation extraction.

2020

pdf bib
Heterogeneous Graph Neural Networks to Predict What Happen Next
Jianming Zheng | Fei Cai | Yanxiang Ling | Honghui Chen
Proceedings of the 28th International Conference on Computational Linguistics

Given an incomplete event chain, script learning aims to predict the missing event, which can support a series of NLP applications. Existing work cannot well represent the heterogeneous relations and capture the discontinuous event segments that are common in the event chain. To address these issues, we introduce a heterogeneous-event (HeterEvent) graph network. In particular, we employ each unique word and individual event as nodes in the graph, and explore three kinds of edges based on realistic relations (e.g., the relations of word-and-word, word-and-event, event-and-event). We also design a message passing process to realize information interactions among homo or heterogeneous nodes. And the discontinuous event segments could be explicitly modeled by finding the specific path between corresponding nodes in the graph. The experimental results on one-step and multi-step inference tasks demonstrate that our ensemble model HeterEvent[W+E] can outperform existing baselines.