Instruct and Extract: Instruction Tuning for On-Demand Information Extraction

Yizhu Jiao; Ming Zhong; Sha Li; Ruining Zhao; Siru Ouyang; Heng Ji; Jiawei Han

doi:10.18653/v1/2023.emnlp-main.620

Abstract

Large language models with instruction-following capabilities open the door to a wider group of users. However, when it comes to information extraction – a classic task in natural language processing – most task-specific systems cannot align well with long-tail ad hoc extraction use cases for non-expert users. To address this, we propose a novel paradigm, termed On-Demand Information Extraction, to fulfill the personalized demands of real-world users. The task is to follow a user's instruction, extract the desired content from the associated text, and present it in a structured tabular format. The table headers can either be user-specified or inferred contextually by the model. To facilitate research in this emerging area, we present a benchmark named InstructIE, comprising both automatically generated training data and a human-annotated test set. Building on InstructIE, we further develop an On-Demand Information Extractor, ODIE. Comprehensive evaluations on our benchmark reveal that ODIE substantially outperforms the existing open-source models of similar size.

Anthology ID: 2023.emnlp-main.620
Volume: Proceedings of the 2023 Conference on Empirical Methods in Natural Language Processing
Month: December
Year: 2023
Address: Singapore
Editors: Houda Bouamor, Juan Pino, Kalika Bali
Venue: EMNLP
Publisher: Association for Computational Linguistics
Pages: 10030–10051
URL: https://aclanthology.org/2023.emnlp-main.620
DOI: 10.18653/v1/2023.emnlp-main.620
Cite (ACL): Yizhu Jiao, Ming Zhong, Sha Li, Ruining Zhao, Siru Ouyang, Heng Ji, and Jiawei Han. 2023. Instruct and Extract: Instruction Tuning for On-Demand Information Extraction. In Proceedings of the 2023 Conference on Empirical Methods in Natural Language Processing, pages 10030–10051, Singapore. Association for Computational Linguistics.
Cite (Informal): Instruct and Extract: Instruction Tuning for On-Demand Information Extraction (Jiao et al., EMNLP 2023)
PDF: https://aclanthology.org/2023.emnlp-main.620.pdf
Video: https://aclanthology.org/2023.emnlp-main.620.mp4
