LogicST: A Logical Self-Training Framework for Document-Level Relation Extraction with Incomplete Annotations

Shengda Fan; Yanting Wang; Shasha Mo; Jianwei Niu

Abstract

Document-level relation extraction (DocRE) aims to identify relationships between entities within a document. Due to the vast number of entity pairs, fully annotating all fact triplets is challenging, resulting in datasets with numerous false negative samples. Recently, self-training-based methods have been introduced to address this issue. However, these methods are purely black-box and sub-symbolic, making them difficult to interpret and prone to overlooking symbolic interdependencies between relations. To remedy this deficiency, our insight is that symbolic knowledge, such as logical rules, can be used as diagnostic tools to identify conflicts between pseudo-labels. By resolving these conflicts through logical diagnoses, we can correct erroneous pseudo-labels, thus enhancing the training of neural models. To achieve this, we propose **LogicST**, a neural-logic self-training framework that iteratively resolves conflicts and constructs the minimal diagnostic set for updating models. Extensive experiments demonstrate that LogicST significantly improves performance and outperforms previous state-of-the-art methods. For instance, LogicST achieves an increase of **7.94%** in F1 score compared to CAST (Tan et al., 2023a) on the DocRED benchmark (Yao et al., 2019). Additionally, LogicST is more time-efficient than its self-training counterparts, requiring only **10%** of the training time of CAST.
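To make the idea of using logical rules as a diagnostic for pseudo-labels more concrete, here is a toy sketch. It is not the LogicST algorithm and does not construct a minimal diagnostic set; it only illustrates the general notion of detecting a rule violation among pseudo-labels and repairing it. The implication rule `capital_of -> located_in`, the confidence scores, the 0.5 threshold, the function names, and the "trust the more confident side" repair heuristic are all assumptions made for this illustration.

```python
from typing import Dict, List, Tuple

Triple = Tuple[str, str, str]  # (head entity, relation, tail entity)
Rule = Tuple[str, str]         # (premise relation, implied relation); hypothetical rule format


def find_conflicts(scores: Dict[Triple, float], rules: List[Rule],
                   threshold: float = 0.5) -> List[Tuple[Triple, Triple]]:
    """Return (premise, implied) pairs where the premise is pseudo-labeled
    positive but the triple it implies is pseudo-labeled negative."""
    conflicts = []
    for (head, rel, tail), score in scores.items():
        if score < threshold:
            continue  # only positive pseudo-labels can trigger a rule
        for premise_rel, implied_rel in rules:
            if rel != premise_rel:
                continue
            implied = (head, implied_rel, tail)
            if scores.get(implied, 0.0) < threshold:
                conflicts.append(((head, rel, tail), implied))
    return conflicts


def resolve(scores: Dict[Triple, float],
            conflicts: List[Tuple[Triple, Triple]],
            threshold: float = 0.5) -> Dict[Triple, bool]:
    """Assumed heuristic: flip whichever side of the conflict lies closer
    to the decision threshold, i.e. the less confident prediction."""
    labels = {triple: score >= threshold for triple, score in scores.items()}
    for premise, implied in conflicts:
        premise_margin = scores[premise] - threshold
        implied_margin = threshold - scores.get(implied, 0.0)
        if premise_margin >= implied_margin:
            labels[implied] = True    # add the implied fact
        else:
            labels[premise] = False   # retract the premise
    return labels


# Made-up model confidences for two candidate triples.
scores = {
    ("Paris", "capital_of", "France"): 0.9,
    ("Paris", "located_in", "France"): 0.4,
}
rules = [("capital_of", "located_in")]

conflicts = find_conflicts(scores, rules)
labels = resolve(scores, conflicts)
```

In this example the confident `capital_of` prediction conflicts with the weakly negative `located_in` prediction, so the heuristic flips `located_in` to positive. The corrected labels could then serve as training targets in the next self-training round.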

Anthology ID: 2024.emnlp-main.314
Volume: Proceedings of the 2024 Conference on Empirical Methods in Natural Language Processing
Month: November
Year: 2024
Address: Miami, Florida, USA
Editors: Yaser Al-Onaizan, Mohit Bansal, Yun-Nung Chen
Venue: EMNLP
Publisher: Association for Computational Linguistics
Pages: 5496–5510
URL: https://aclanthology.org/2024.emnlp-main.314
Cite (ACL): Shengda Fan, Yanting Wang, Shasha Mo, and Jianwei Niu. 2024. LogicST: A Logical Self-Training Framework for Document-Level Relation Extraction with Incomplete Annotations. In Proceedings of the 2024 Conference on Empirical Methods in Natural Language Processing, pages 5496–5510, Miami, Florida, USA. Association for Computational Linguistics.
Cite (Informal): LogicST: A Logical Self-Training Framework for Document-Level Relation Extraction with Incomplete Annotations (Fan et al., EMNLP 2024)
PDF: https://aclanthology.org/2024.emnlp-main.314.pdf
Software: 2024.emnlp-main.314.software.zip
Data: 2024.emnlp-main.314.data.zip
