Xiang Zhao

2025

Dynamic-prototype Contrastive Fine-tuning for Continual Few-shot Relation Extraction with Unseen Relation Detection
Si Miao Zhao | Zhen Tan | Ning Pang | Wei Dong Xiao | Xiang Zhao
Proceedings of the 31st International Conference on Computational Linguistics

Continual Few-shot Relation Extraction (CFRE) aims to continually learn new relations from limited labeled data while preserving knowledge about previously learned relations. Facing the inherent issue of catastrophic forgetting, previous approaches predominantly rely on memory replay strategies. However, they often overlook task interference in continual learning and the varying memory requirements for different relations. To address these shortcomings, we propose a novel framework, DPC-FT, which features: 1) a lightweight relation encoder for each task to mitigate negative knowledge transfer across tasks; 2) a dynamic prototype module to allocate less memory for easier relations and more memory for harder relations. Additionally, we introduce the None-Of-The-Above (NOTA) detection in CFRE and propose a threshold criterion to identify relations that have never been learned. Extensive experiments demonstrate the effectiveness and efficiency of our method in CFRE, making our approach more practical and comprehensive for real-world scenarios.
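The threshold-based NOTA idea mentioned in the abstract above can be illustrated with a minimal sketch. This is not the DPC-FT implementation: the cosine-similarity scoring, the `predict_with_nota` helper, the `threshold` value, and the toy two-dimensional prototypes are all assumptions made for illustration. Relations may hold different numbers of prototypes, loosely mirroring the idea of giving harder relations more memory.

```python
import math


def cosine(u, v):
    """Cosine similarity between two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0


def predict_with_nota(embedding, prototypes, threshold=0.5):
    """Return the label of the most similar prototype, or "NOTA" when the
    best similarity falls below the (hypothetical) threshold."""
    best_label, best_score = "NOTA", -1.0
    for label, proto_list in prototypes.items():
        for proto in proto_list:
            score = cosine(embedding, proto)
            if score > best_score:
                best_label, best_score = label, score
    return best_label if best_score >= threshold else "NOTA"


# Toy memory: one prototype for an "easy" relation, two for a "harder" one.
prototypes = {
    "founded_by": [[1.0, 0.0]],
    "born_in": [[0.0, 1.0], [0.2, 0.9]],
}

seen = predict_with_nota([0.9, 0.1], prototypes)
unseen = predict_with_nota([-1.0, -1.0], prototypes)
```

Here `seen` resolves to a stored relation because the query lies close to one of its prototypes, while `unseen` is rejected as NOTA because no prototype reaches the threshold.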

Semantic and Sentiment Dual-Enhanced Generative Model for Script Event Prediction
Feiyang Wu | Peixin Huang | Yanli Hu | Zhen Tan | Xiang Zhao
Proceedings of the 31st International Conference on Computational Linguistics

Script Event Prediction (SEP) aims to forecast the next event in a sequence from a list of candidates. Traditional methods often use pre-trained language models to model event associations but struggle with semantic ambiguity and embedding bias. Semantic ambiguity arises from the multiple meanings of identical words and insufficient consideration of event arguments, while embedding bias results from assigning similar word embeddings to event pairs with similar lexical features, despite their different meanings. To address the above issues, we propose the Semantic and Sentiment Dual-enhanced Generative Model (SSD-GM). SSD-GM leverages two types of script event information to enhance the generative model. Specifically, it employs a GNN-based semantic structure aggregator to integrate the event-centric structure information, thereby mitigating the impact of semantic ambiguity. Furthermore, we find that local sentiment variability effectively reduces biases in event embeddings, while maintaining global sentiment consistency enhances predictive accuracy. As a result, SSD-GM adeptly captures both global and local sentiment of events through its sentiment information awareness mechanism. Extensive experiments on the Multi-Choice Narrative Cloze (MCNC) task demonstrate that our approach achieves better results than other state-of-the-art baselines.

How Do Social Bots Participate in Misinformation Spread? A Comprehensive Dataset and Analysis
Herun Wan | Minnan Luo | Zihan Ma | Guang Dai | Xiang Zhao
Proceedings of the 2025 Conference on Empirical Methods in Natural Language Processing

Social media platforms provide an ideal environment to spread misinformation, where social bots can accelerate the spread. This paper explores the interplay between social bots and misinformation on the Sina Weibo platform. We construct a large-scale dataset that includes annotations for both misinformation and social bots. From the misinformation perspective, the dataset is multimodal, containing 11,393 pieces of misinformation and 16,416 pieces of verified information. From the social bot perspective, this dataset contains 65,749 social bots and 345,886 genuine accounts, annotated using a weakly supervised annotator. Extensive experiments demonstrate the comprehensiveness of the dataset, the clear distinction between misinformation and real information, and the high quality of social bot annotations. Further analysis illustrates that: (i) social bots are deeply involved in information spread; (ii) misinformation with the same topics has similar content, providing the basis of echo chambers, and social bots would amplify this phenomenon; and (iii) social bots generate similar content aiming to manipulate public opinions.

Multi-Modal Entities Matter: Benchmarking Multi-Modal Entity Alignment
GuanChen Xiao | WeiXin Zeng | ShiQi Zhang | MingRui Lao | Xiang Zhao
Proceedings of the 31st International Conference on Computational Linguistics

Multi-modal entity alignment (MMEA) is a long-standing task that aims to discover identical entities between different multi-modal knowledge graphs (MMKGs). However, most existing MMEA datasets treat multi-modal data as attributes of textual entities, neglecting the correlations among the multi-modal data, and thus do not fit real-world scenarios well. In response, in this work, we establish a novel yet practical MMEA dataset, i.e., NMMEA, which models multi-modal data (e.g., images) equally as textual entities in the MMKG. Due to the introduction of multi-modal data, NMMEA poses new challenges to existing MMEA solutions, i.e., heterogeneous structural representation learning and cross-modal alignment inference. Hence, we put forward a simple yet effective solution, CrossEA, which can effectively learn the structural information of entities by considering both intra-modal and cross-modal relations, and further infer the similarity of different types of entity pairs. Extensive experiments validate the significance of NMMEA, where CrossEA can achieve superior performance in contrast to competitive methods on the proposed dataset.

On the Risk of Evidence Pollution for Malicious Social Text Detection in the Era of LLMs
Herun Wan | Minnan Luo | Zhixiong Su | Guang Dai | Xiang Zhao
Proceedings of the 63rd Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers)

Evidence-enhanced detectors present remarkable abilities in identifying malicious social text. However, the rise of large language models (LLMs) brings potential risks of evidence pollution to confuse detectors. This paper explores potential manipulation scenarios including basic pollution, and rephrasing or generating evidence by LLMs. To mitigate the negative impact, we propose three defense strategies from the data and model sides, including machine-generated text detection, a mixture of experts, and parameter updating. Extensive experiments on four malicious social text detection tasks with ten datasets illustrate that evidence pollution significantly compromises detectors, where the generating strategy causes up to a 14.4% performance drop. Meanwhile, the defense strategies could mitigate evidence pollution, but they faced limitations for practical employment. Further analysis illustrates that polluted evidence (i) is of high quality, evaluated by metrics and humans; (ii) would compromise the model calibration, increasing expected calibration error up to 21.6%; and (iii) could be integrated to amplify the negative impact, especially for encoder-based LMs, where the accuracy drops by 21.8%.
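For readers unfamiliar with the expected calibration error (ECE) mentioned in the abstract above, the following is a minimal sketch of the usual binned formulation: predictions are grouped by confidence, and the gaps between accuracy and mean confidence are averaged with weights proportional to bin size. The function name, the choice of ten equal-width bins, and the half-open binning are assumptions for illustration and are not taken from the paper.

```python
def expected_calibration_error(confidences, correct, n_bins=10):
    """Binned ECE: sum over bins of (bin size / total) * |accuracy - mean confidence|.

    confidences: predicted probabilities of the chosen class, each in [0, 1].
    correct:     booleans, True where the prediction was right.
    """
    total = len(confidences)
    if total == 0:
        return 0.0

    # Equal-width bins over [0, 1]; a confidence of exactly 1.0 goes in the last bin.
    bins = [[] for _ in range(n_bins)]
    for conf, ok in zip(confidences, correct):
        idx = min(int(conf * n_bins), n_bins - 1)
        bins[idx].append((conf, ok))

    ece = 0.0
    for members in bins:
        if not members:
            continue
        avg_conf = sum(c for c, _ in members) / len(members)
        accuracy = sum(1 for _, ok in members if ok) / len(members)
        ece += len(members) / total * abs(accuracy - avg_conf)
    return ece


# Four predictions at 0.9 confidence, three of them correct.
overconfident = expected_calibration_error([0.9, 0.9, 0.9, 0.9], [True, True, True, False])
```

In this toy case the model claims 90% confidence but is right 75% of the time, so the single occupied bin contributes a gap of about 0.15.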

2024

Event-Radar: Event-driven Multi-View Learning for Multimodal Fake News Detection
Zihan Ma | Minnan Luo | Hao Guo | Zhi Zeng | Yiran Hao | Xiang Zhao
Proceedings of the 62nd Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers)

The swift detection of multimedia fake news has emerged as a crucial task in combating malicious propaganda and safeguarding the security of the online environment. While existing methods have achieved commendable results in modeling entity-level inconsistency, addressing event-level inconsistency following the inherent subject-predicate logic of news and robustly learning news representations from poor-quality news samples remain two challenges. In this paper, we propose an Event-driven fake news detection framework (Event-Radar) based on multi-view learning, which integrates visual manipulation, textual emotion and multimodal inconsistency at event-level for fake news detection. Specifically, leveraging the capability of graph structures to capture interactions between events and parameters, Event-Radar captures event-level multimodal inconsistency by constructing an event graph that includes multimodal entity subject-predicate logic. Additionally, to mitigate the interference of poor-quality news, Event-Radar introduces a multi-view fusion mechanism, learning comprehensive and robust representations by computing the credibility of each view as a clue, thereby detecting fake news. Extensive experiments demonstrate that Event-Radar achieves outstanding performance on three large-scale fake news detection benchmarks. Our studies also confirm that Event-Radar exhibits strong robustness, providing a paradigm for detecting fake news from noisy news samples.

Temporal Knowledge Question Answering via Abstract Reasoning Induction
Ziyang Chen | Dongfang Li | Xiang Zhao | Baotian Hu | Min Zhang
Proceedings of the 62nd Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers)

In this study, we address the challenge of enhancing temporal knowledge reasoning in Large Language Models (LLMs). LLMs often struggle with this task, leading to the generation of inaccurate or misleading responses. This issue mainly arises from their limited ability to handle evolving factual knowledge and complex temporal logic. To overcome these limitations, we propose the Abstract Reasoning Induction (ARI) framework, which divides temporal reasoning into two distinct phases: knowledge-agnostic and knowledge-based. This framework offers factual knowledge support to LLMs while minimizing the incorporation of extraneous noisy data. Concurrently, informed by the principles of constructivism, ARI provides LLMs the capability to engage in proactive, self-directed learning from both correct and incorrect historical reasoning samples. By teaching LLMs to actively construct knowledge and methods, it can significantly boost their temporal reasoning abilities. Our approach achieves significant improvements, with relative gains of 29.7% and 9.27% on two temporal QA datasets, underscoring its efficacy in advancing temporal reasoning in LLMs. The code can be found at https://github.com/czy1999/ARI-QA.

SCL: Selective Contrastive Learning for Data-driven Zero-shot Relation Extraction
Ning Pang | Xiang Zhao | Weixin Zeng | Zhen Tan | Weidong Xiao
Transactions of the Association for Computational Linguistics, Volume 12

Relation extraction has evolved from supervised relation extraction to zero-shot setting due to the continuous emergence of newly generated relations. Some pioneering works handle zero-shot relation extraction by reformulating it into proxy tasks, such as reading comprehension and textual entailment. Nonetheless, the divergence in proxy task formulations from relation extraction hinders the acquisition of informative semantic representations, leading to subpar performance. Therefore, in this paper, we take a data-driven view to handle zero-shot relation extraction under a three-step paradigm, including encoder training, relation clustering, and summarization. Specifically, to train a discriminative relational encoder, we propose a novel selective contrastive learning framework, namely, SCL, where selective importance scores are assigned to distinguish the importance of different negative contrastive instances. During testing, the prompt-based encoder is employed to map test samples into representation vectors, which are then clustered into several groups. Typical samples closest to the cluster centroid are selected for summarization to generate the predicted relation for all samples in the cluster. Moreover, we design a simple non-parametric threshold plugin to reduce false-positive errors in inference on unseen relation representations. Our experiments demonstrate that SCL outperforms the current state-of-the-art method by over 3% across all metrics.

Distill, Fuse, Pre-train: Towards Effective Event Causality Identification with Commonsense-Aware Pre-trained Model
Peixin Huang | Xiang Zhao | Minghao Hu | Zhen Tan | Weidong Xiao
Proceedings of the 2024 Joint International Conference on Computational Linguistics, Language Resources and Evaluation (LREC-COLING 2024)

Event Causality Identification (ECI) aims to detect causal relations between events in unstructured texts. This task is challenged by the lack of data and explicit causal clues. Some methods incorporate explicit knowledge from external knowledge graphs (KGs) into Pre-trained Language Models (PLMs) to tackle these issues, achieving certain accomplishments. However, they ignore that existing KGs usually contain trivial knowledge which may prejudice the performance. Moreover, they simply integrate the concept triplets, underutilizing the deep interaction between the text and external graph. In this paper, we propose an effective pipeline DFP, i.e., Distill, Fuse and Pre-train, to build a commonsense-aware pre-trained model which integrates reliable task-specific knowledge from commonsense graphs. This pipeline works as follows: (1) To leverage the reliable knowledge, commonsense graph distillation is proposed to distill commonsense graphs and obtain the meta-graph which contains credible task-oriented knowledge. (2) To model the deep interaction between the text and external graph, heterogeneous information fusion is proposed to fuse them through a commonsense-aware memory network. (3) Continual pre-training designs three continual pre-training tasks to further align and fuse the text and the commonsense meta-graph. Through extensive experiments on two benchmarks, we demonstrate the validity of our pipeline.

2023

Multi-granularity Temporal Question Answering over Knowledge Graphs
Ziyang Chen | Jinzhi Liao | Xiang Zhao
Proceedings of the 61st Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers)

Recently, question answering over temporal knowledge graphs (i.e., TKGQA) has been introduced and investigated, in quest of reasoning about dynamic factual knowledge. To foster research on TKGQA, a few datasets have been curated (e.g., CronQuestions and Complex-CronQuestions), and various models have been proposed based on these datasets. Nevertheless, existing efforts overlook the fact that real-life applications of TKGQA also tend to be complex in temporal granularity, i.e., the questions may concern mixed temporal granularities (e.g., both day and month). To overcome the limitation, in this paper, we motivate the notion of multi-granularity temporal question answering over knowledge graphs and present a large scale dataset for multi-granularity TKGQA, namely MultiTQ. To the best of our knowledge, MultiTQ is among the first of its kind, and compared with existing datasets on TKGQA, MultiTQ features at least two desirable aspects: ample relevant facts and multiple temporal granularities. It is expected to better reflect real-world challenges, and serve as a test bed for TKGQA models. In addition, we propose a competing baseline MultiQA over MultiTQ, which is experimentally demonstrated to be effective in dealing with TKGQA. The data and code are released at https://github.com/czy1999/MultiTQ.

T2-NER: A Two-Stage Span-Based Framework for Unified Named Entity Recognition with Templates
Peixin Huang | Xiang Zhao | Minghao Hu | Zhen Tan | Weidong Xiao
Transactions of the Association for Computational Linguistics, Volume 11

Named Entity Recognition (NER) has so far evolved from the traditional flat NER to overlapped and discontinuous NER. They have mostly been solved separately, with only several exceptions that concurrently tackle the three tasks with a single model. The current best-performing method formalizes the unified NER as word-word relation classification, which barely focuses on mention content learning and fails to detect entity mentions comprising a single word. In this paper, we propose a two-stage span-based framework with templates, namely, T2-NER, to resolve the unified NER task. The first stage is to extract entity spans, where flat and overlapped entities can be recognized. The second stage is to classify over all entity span pairs, where discontinuous entities can be recognized. Finally, multi-task learning is used to jointly train the two stages. To improve the efficiency of the span-based model, we design grouped templates and typed templates for the two stages to realize batch computations. We also apply an adjacent packing strategy and a latter packing strategy to model discriminative boundary information and learn better span (pair) representation. Moreover, we introduce the syntax information to enhance our span representation. We perform extensive experiments on eight benchmark datasets for flat, overlapped, and discontinuous NER, where our model beats all the current competitive baselines, obtaining the best performance of unified NER.

2022

Extract-Select: A Span Selection Framework for Nested Named Entity Recognition with Generative Adversarial Training
Peixin Huang | Xiang Zhao | Minghao Hu | Yang Fang | Xinyi Li | Weidong Xiao
Findings of the Association for Computational Linguistics: ACL 2022

Nested named entity recognition (NER) is a task in which named entities may overlap with each other. Span-based approaches regard nested NER as a two-stage span enumeration and classification task, thus having the innate ability to handle this task. However, they face the problems of error propagation, ignorance of span boundary, difficulty in long entity recognition and requirement on large-scale annotated data. In this paper, we propose Extract-Select, a span selection framework for nested NER, to tackle these problems. Firstly, we introduce a span selection framework in which nested entities with different input categories would be separately extracted by the extractor, thus naturally avoiding error propagation in two-stage span-based approaches. In the inference phase, the trained extractor selects final results specific to the given entity category. Secondly, we propose a hybrid selection strategy in the extractor, which not only makes full use of span boundary but also improves the ability of long entity recognition. Thirdly, we design a discriminator to evaluate the extraction result, and train both extractor and discriminator with generative adversarial training (GAT). The use of GAT greatly alleviates the stress on the dataset size. Experimental results on four benchmark datasets demonstrate that Extract-Select outperforms competitive nested NER models, obtaining state-of-the-art results. The proposed model also performs well when less labeled data are given, proving the effectiveness of GAT.

2021

Relation-aware Bidirectional Path Reasoning for Commonsense Question Answering
Junxing Wang | Xinyi Li | Zhen Tan | Xiang Zhao | Weidong Xiao
Proceedings of the 25th Conference on Computational Natural Language Learning

Commonsense Question Answering is an important natural language processing (NLP) task that aims to predict the correct answer to a question through commonsense reasoning. Previous studies utilize pre-trained models on large-scale corpora such as BERT, or perform reasoning on knowledge graphs. However, these methods do not explicitly model the relations that connect entities, which are informational and can be used to enhance reasoning. To address this issue, we propose a relation-aware reasoning method. Our method uses a relation-aware graph neural network to capture the rich contextual information from both entities and relations. Compared with methods that use fixed relation embeddings from pre-trained models, our model dynamically updates relations with contextual information from a multi-source subgraph, built from multiple external knowledge sources. The enhanced representations of relations are then fed to a bidirectional reasoning module. A bidirectional attention mechanism is applied between the question sequence and the paths that connect entities, which provides us with transparent interpretability. Experimental results on the CommonsenseQA dataset illustrate that our method results in significant improvements over the baselines while also providing clear reasoning paths.

2020

CLEEK: A Chinese Long-text Corpus for Entity Linking
Weixin Zeng | Xiang Zhao | Jiuyang Tang | Zhen Tan | Xuqian Huang
Proceedings of the Twelfth Language Resources and Evaluation Conference

Entity linking, as one of the fundamental tasks in natural language processing, is crucial to knowledge fusion, knowledge base construction and update. Nevertheless, in contrast to the research on entity linking for English text, which undergoes continuous development, the Chinese counterpart is still in its infancy. One prominent issue lies in publicly available annotated datasets and evaluation benchmarks, which are lacking and deficient. Specifically, existing Chinese corpora for entity linking were mainly constructed from noisy short texts, such as microblogs and news headings, while long texts, which constitute a wider spectrum of real-life scenarios, were largely overlooked. To address the issue, in this work, we build CLEEK, a Chinese corpus of multi-domain long text for entity linking, in order to encourage advancement of entity linking in languages besides English. The corpus consists of 100 documents from diverse domains, and is publicly accessible. Moreover, we devise a measure to evaluate the difficulty of documents with respect to entity linking, which is then used to characterize the corpus. Additionally, the results of two baselines and seven state-of-the-art solutions on CLEEK are reported and compared. The empirical results validate the usefulness of CLEEK and the effectiveness of the proposed difficulty measure.

Joint Event Extraction with Hierarchical Policy Network
Peixin Huang | Xiang Zhao | Ryuichi Takanobu | Zhen Tan | Weidong Xiao
Proceedings of the 28th International Conference on Computational Linguistics

Most existing work on event extraction (EE) either follows a pipelined manner or uses a joint structure but is pipelined in essence. As a result, these efforts fail to utilize information interactions among event triggers, event arguments, and argument roles, which causes information redundancy. In view of this, we propose to exploit the role information of the arguments in an event and devise a Hierarchical Policy Network (HPNet) to perform joint EE. The whole EE process is fulfilled through a two-level hierarchical structure consisting of two policy networks for event detection and argument detection. Deep information interactions among the subtasks are thus realized, and sentences with multiple events are handled more naturally. Extensive experiments on ACE2005 and TAC2015 demonstrate the superiority of HPNet, which achieves state-of-the-art performance and is more powerful for sentences with multiple events.