A Rationale-centric Counterfactual Data Augmentation Method for Cross-Document Event Coreference Resolution

Bowen Ding, Qingkai Min, Shengkun Ma, Yingjie Li, Linyi Yang, Yue Zhang


Abstract
Event coreference resolution (ECR) systems based on Pre-trained Language Models (PLMs) have demonstrated outstanding performance in clustering coreferential events across documents. However, the state-of-the-art system relies excessively on the spurious ‘trigger lexical matching’ pattern in the input mention-pair text. We formalize the decision-making process of the baseline ECR system using a Structural Causal Model (SCM), aiming to identify spurious and causal associations (i.e., rationales) within the ECR task. Leveraging the debiasing capability of counterfactual data augmentation, we develop a rationale-centric counterfactual data augmentation method with an LLM in the loop. The method is specialized for the pairwise input of the ECR system: we intervene directly on triggers and context to mitigate the spurious association while emphasizing the causal one. Our approach achieves state-of-the-art performance on three popular cross-document ECR benchmarks and demonstrates robustness in out-of-domain scenarios.
Anthology ID:
2024.naacl-long.63
Volume:
Proceedings of the 2024 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies (Volume 1: Long Papers)
Month:
June
Year:
2024
Address:
Mexico City, Mexico
Editors:
Kevin Duh, Helena Gomez, Steven Bethard
Venue:
NAACL
Publisher:
Association for Computational Linguistics
Pages:
1112–1140
URL:
https://aclanthology.org/2024.naacl-long.63
Cite (ACL):
Bowen Ding, Qingkai Min, Shengkun Ma, Yingjie Li, Linyi Yang, and Yue Zhang. 2024. A Rationale-centric Counterfactual Data Augmentation Method for Cross-Document Event Coreference Resolution. In Proceedings of the 2024 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies (Volume 1: Long Papers), pages 1112–1140, Mexico City, Mexico. Association for Computational Linguistics.
Cite (Informal):
A Rationale-centric Counterfactual Data Augmentation Method for Cross-Document Event Coreference Resolution (Ding et al., NAACL 2024)
PDF:
https://aclanthology.org/2024.naacl-long.63.pdf
Copyright:
 2024.naacl-long.63.copyright.pdf