Jie Gong
2025
Extracting Military Event Temporal Relations via Relative Event Time Prediction and Virtual Adversarial Training
Jie Gong | Qiwang Hu
Findings of the Association for Computational Linguistics: NAACL 2025
Extracting temporal relations between events in text is crucial for understanding how events unfold over time, especially in the information-dense and precision-demanding military domain. Existing models for event temporal relation extraction typically compare the relative times of events directly, neglecting the contextual information between event pairs, which makes it difficult to handle the uncertain temporal boundaries expressed in text. In this paper, we propose MFRV, an event temporal relation extraction model for the military domain based on relative event time prediction and virtual adversarial training. Relative event time prediction, used as an auxiliary task, strengthens the model's ability to capture and infer temporal relations. Virtual adversarial training improves the model's generalization by generating adversarial samples. Additionally, we adopt the MoCo (multi-objective gradient correction) method to balance the losses from relative event time prediction and virtual adversarial training, effectively resolving the gradient bias issue in multi-objective optimization. Furthermore, we construct a new dataset, TRMF, specifically for event temporal relation extraction in the military domain. Experiments on TRMF, as well as on the widely used public datasets MATRES and TCR, demonstrate the effectiveness of MFRV.
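To make the virtual adversarial training component concrete, below is a minimal sketch of the standard one-step power-iteration VAT (Miyato et al., 2018) over a text encoder's embeddings. It is not the paper's implementation: the interface (a model mapping inputs_embeds and attention_mask to logits) and the hyperparameter names xi and eps are assumptions, and the paper's MoCo gradient correction is not reproduced here, so a fixed-weight sum of the losses stands in for its multi-objective balancing.

```python
import torch
import torch.nn.functional as F

def vat_loss(model, embeds, mask, logits_clean, xi=1e-6, eps=1.0):
    """One power-iteration step of virtual adversarial training.

    Assumes `model(inputs_embeds=..., attention_mask=...)` returns logits;
    this interface is hypothetical, not the paper's actual code.
    """
    # Random unit direction in embedding space.
    d = torch.randn_like(embeds)
    d = F.normalize(d.view(d.size(0), -1), dim=1).view_as(d)
    d.requires_grad_()
    # KL between clean predictions and predictions under a tiny perturbation xi*d.
    logits_pert = model(inputs_embeds=embeds + xi * d, attention_mask=mask)
    kl = F.kl_div(F.log_softmax(logits_pert, dim=-1),
                  F.softmax(logits_clean.detach(), dim=-1),
                  reduction="batchmean")
    # The gradient of this KL w.r.t. d approximates the most sensitive direction.
    grad = torch.autograd.grad(kl, d)[0]
    r_adv = eps * F.normalize(grad.view(grad.size(0), -1), dim=1).view_as(grad)
    # Smoothness loss at the virtual adversarial point.
    logits_adv = model(inputs_embeds=embeds + r_adv.detach(), attention_mask=mask)
    return F.kl_div(F.log_softmax(logits_adv, dim=-1),
                    F.softmax(logits_clean.detach(), dim=-1),
                    reduction="batchmean")

# Illustrative combined objective with fixed weights; MFRV instead balances
# the relation, relative-time, and VAT losses with MoCo to correct gradient bias.
# loss = loss_relation + lam_time * loss_time + lam_vat * vat_loss(...)
```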
EventRelBench: A Comprehensive Benchmark for Evaluating Event Relation Understanding in Large Language Models
Jie Gong | Biaoshuai Zheng | Qiwang Hu
Findings of the Association for Computational Linguistics: EMNLP 2025
Understanding event relationships is critical for tasks such as narrative comprehension, information extraction, and reasoning in natural language processing. Despite the remarkable advances of large language models (LLMs) across diverse NLP tasks, existing studies have not systematically evaluated their ability to capture the complexity of event relations. To this end, we assess LLMs on event relation extraction (ERE) by designing the benchmark EventRelBench. EventRelBench comprises 35K diverse event relation questions covering four key categories (coreference, temporal, causal, and super-sub relations), posed at two levels of granularity: document-level and sentence-level. Extensive experiments on LLMs of different sizes and types show that existing LLMs still fall short in accurately extracting and understanding event relationships. To address this gap, we introduce EventRelInst, a 48K instruction fine-tuning dataset for the event relation extraction domain. Experimental results not only highlight the shortcomings of current general-purpose LLMs in extracting event relations but also demonstrate the effectiveness of EventRelInst. Both EventRelBench and EventRelInst will be publicly available.
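As an illustration of how such a benchmark item might be posed to an LLM, the snippet below shows one plausible shape for a sentence-level temporal question and an evaluation prompt. The field names, the example text, and the four-way label set are assumptions for illustration only, not EventRelBench's published schema.

```python
# Hypothetical EventRelBench-style item; the real schema may differ.
example = {
    "granularity": "sentence",   # or "document"
    "category": "temporal",      # coreference | temporal | causal | super-sub
    "context": "The convoy crossed the bridge after the engineers repaired it.",
    "event_1": "repaired",
    "event_2": "crossed",
    "options": ["BEFORE", "AFTER", "EQUAL", "VAGUE"],
    "answer": "BEFORE",
}

# A simple multiple-choice prompt; str.format ignores the unused keys.
PROMPT = (
    "Context: {context}\n"
    "What is the temporal relation between '{event_1}' and '{event_2}'?\n"
    "Options: {options}\n"
    "Answer with exactly one option."
)

print(PROMPT.format(**example))
```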