Classify First, and Then Extract: Prompt Chaining Technique for Information Extraction

Alice Kwak, Clayton Morrison, Derek Bambauer, Mihai Surdeanu


Abstract
This work presents a new task-aware prompt design and example retrieval approach for information extraction (IE) using a prompt chaining technique. Our approach divides IE tasks into two steps: (1) text classification to understand what information (e.g., entity or event types) is contained in the underlying text and (2) information extraction for the identified types. First, we use a large language model (LLM) in a few-shot setting to classify the contained information. The classification output is then used to select the relevant prompt and retrieve the examples relevant to the input text. Finally, we ask an LLM to perform the information extraction with the generated prompt. By evaluating our approach on legal IE tasks with two different LLMs, we demonstrate that the prompt chaining technique improves the LLM’s overall performance in a few-shot setting when compared to the baseline in which examples from all possible classes are included in the prompt. Our approach can be used in a low-resource setting, as it does not require a large amount of training data. It can also be easily adapted to many different IE tasks by simply adjusting the prompts. Lastly, it provides a cost benefit by reducing the number of tokens in the prompt.
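The two-step chain described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation: the class labels, example pool, prompt wording, and the `llm` callable are all hypothetical placeholders, and examples are selected by predicted class only, which simplifies whatever retrieval the paper actually uses.

```python
from typing import Callable, Dict, List, Tuple

# Hypothetical few-shot pool: class label -> (input text, gold extraction) pairs.
EXAMPLES: Dict[str, List[Tuple[str, str]]] = {
    "bequest": [("I give my car to my son Tom.",
                 "asset: my car; beneficiary: my son Tom")],
    "executor": [("I appoint Ann as executor.", "executor: Ann")],
}

# Hypothetical per-class extraction instructions.
INSTRUCTIONS: Dict[str, str] = {
    "bequest": "Extract the asset and the beneficiary.",
    "executor": "Extract the executor.",
}


def build_classification_prompt(text: str) -> str:
    """Step 1 prompt: few-shot classification over all known classes."""
    parts = ["Which types appear in the text? Answer with a comma-separated list.",
             "Types: " + ", ".join(EXAMPLES)]
    for label, pairs in EXAMPLES.items():
        for example_text, _ in pairs:
            parts.append(f"Text: {example_text}\nTypes: {label}")
    parts.append(f"Text: {text}\nTypes:")
    return "\n\n".join(parts)


def build_extraction_prompt(text: str, labels: List[str]) -> str:
    """Step 2 prompt: instructions and examples for the predicted classes only."""
    parts = [INSTRUCTIONS[label] for label in labels]
    for label in labels:
        for example_text, extraction in EXAMPLES[label]:
            parts.append(f"Text: {example_text}\nExtraction: {extraction}")
    parts.append(f"Text: {text}\nExtraction:")
    return "\n\n".join(parts)


def classify_then_extract(text: str,
                          llm: Callable[[str], str]) -> Tuple[List[str], str]:
    """Chain the two calls; `llm` is any function mapping a prompt to a completion."""
    raw = llm(build_classification_prompt(text))
    predicted = {part.strip().lower() for part in raw.split(",")}
    labels = [label for label in EXAMPLES if label in predicted]
    if not labels:
        # Nothing recognized: skip the second call entirely.
        return [], ""
    return labels, llm(build_extraction_prompt(text, labels))
```

Because the second prompt carries examples only for the predicted classes, it is shorter than a prompt that includes examples from every class, which is the source of the token savings the abstract mentions.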
Anthology ID:
2024.nllp-1.25
Volume:
Proceedings of the Natural Legal Language Processing Workshop 2024
Month:
November
Year:
2024
Address:
Miami, FL, USA
Editors:
Nikolaos Aletras, Ilias Chalkidis, Leslie Barrett, Cătălina Goanță, Daniel Preoțiuc-Pietro, Gerasimos Spanakis
Venue:
NLLP
Publisher:
Association for Computational Linguistics
Pages:
303–317
URL:
https://aclanthology.org/2024.nllp-1.25
Cite (ACL):
Alice Kwak, Clayton Morrison, Derek Bambauer, and Mihai Surdeanu. 2024. Classify First, and Then Extract: Prompt Chaining Technique for Information Extraction. In Proceedings of the Natural Legal Language Processing Workshop 2024, pages 303–317, Miami, FL, USA. Association for Computational Linguistics.
Cite (Informal):
Classify First, and Then Extract: Prompt Chaining Technique for Information Extraction (Kwak et al., NLLP 2024)
PDF:
https://aclanthology.org/2024.nllp-1.25.pdf