CycleOIE: A Low-Resource Training Framework For Open Information Extraction

Zhihong Jin, Chunhong Zhang, Zheng Hu, Jibin Yu, Ruiqi Ma, Qingyun Chen, Xiaohao Liao, Yanxing Zhang


Abstract
Open Information Extraction (OpenIE) aims to extract structured information in the form of triples from unstructured text, serving as a foundation for various downstream NLP tasks. Despite the success of neural OpenIE models, their dependence on large-scale annotated datasets poses a challenge, particularly in low-resource settings. In this paper, we introduce a novel approach to address the low-resource OpenIE task through two key innovations: (1) we improve the quality of training data by curating small-scale, high-quality datasets annotated by a large language model (GPT-3.5), leveraging both OpenIE principles and few-shot examples to form LSOIE-g principles and LSOIE-g examples; (2) we propose CycleOIE, a training framework that maximizes data efficiency through a cycle-consistency mechanism, enabling the model to learn effectively from minimal data. Experimental results show that CycleOIE, when trained on only 2k+ instances, achieves comparable results to models trained on over 90k instances. Our contributions are further validated through extensive experiments, demonstrating the superior performance of CycleOIE and our curated LSOIE-g datasets in low-resource OpenIE as well as revealing the internal mechanisms of CycleOIE.
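The cycle-consistency idea described above can be illustrated with a toy round-trip check. This is purely a sketch of the general mechanism, not the paper's method: CycleOIE uses neural sequence-to-sequence models, whereas the `extract`, `reconstruct`, and `cycle_consistency` functions below are rule-based stand-ins I introduce for illustration.

```python
def extract(sentence: str) -> tuple:
    """Stand-in 'forward' model: naively split a simple
    subject-verb-object sentence into a (subj, rel, obj) triple."""
    words = sentence.rstrip(".").split()
    return (words[0], words[1], " ".join(words[2:]))

def reconstruct(triple: tuple) -> str:
    """Stand-in 'backward' model: verbalize a triple back into text."""
    return " ".join(triple) + "."

def cycle_consistency(sentence: str) -> float:
    """Token-level Jaccard overlap between the input sentence and its
    round-trip reconstruction; a cycle loss would penalize 1 - score,
    letting both directions supervise each other without extra labels."""
    round_trip = reconstruct(extract(sentence))
    a = set(sentence.rstrip(".").lower().split())
    b = set(round_trip.rstrip(".").lower().split())
    return len(a & b) / len(a | b)

print(cycle_consistency("Marie discovered radium."))
```

In a real low-resource setup, the intuition is that unlabeled sentences still provide a training signal: if extracting triples and then verbalizing them fails to recover the input, that discrepancy can be penalized, which is how a cycle mechanism squeezes more supervision out of a small annotated set.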
Anthology ID:
2025.coling-main.227
Volume:
Proceedings of the 31st International Conference on Computational Linguistics
Month:
January
Year:
2025
Address:
Abu Dhabi, UAE
Editors:
Owen Rambow, Leo Wanner, Marianna Apidianaki, Hend Al-Khalifa, Barbara Di Eugenio, Steven Schockaert
Venue:
COLING
Publisher:
Association for Computational Linguistics
Pages:
3372–3390
URL:
https://aclanthology.org/2025.coling-main.227/
Cite (ACL):
Zhihong Jin, Chunhong Zhang, Zheng Hu, Jibin Yu, Ruiqi Ma, Qingyun Chen, Xiaohao Liao, and Yanxing Zhang. 2025. CycleOIE: A Low-Resource Training Framework For Open Information Extraction. In Proceedings of the 31st International Conference on Computational Linguistics, pages 3372–3390, Abu Dhabi, UAE. Association for Computational Linguistics.
Cite (Informal):
CycleOIE: A Low-Resource Training Framework For Open Information Extraction (Jin et al., COLING 2025)
PDF:
https://aclanthology.org/2025.coling-main.227.pdf