Active Learning Principles for In-Context Learning with Large Language Models

Katerina Margatina; Timo Schick; Nikolaos Aletras; Jane Dwivedi-Yu

doi:10.18653/v1/2023.findings-emnlp.334

Abstract

The remarkable advancements in large language models (LLMs) have significantly enhanced predictive performance in few-shot learning settings. By using only a small number of labeled examples, referred to as demonstrations, LLMs can effectively perform the task at hand through in-context learning. However, the process of selecting demonstrations for maximizing performance has received limited attention in prior work. This paper addresses the issue of identifying the most informative demonstrations for few-shot learning by approaching it as a pool-based Active Learning (AL) problem over a single iteration. We compare standard AL algorithms based on uncertainty, diversity, and similarity, and consistently observe that the latter outperforms all other methods, including random sampling. Our extensive experimentation involving a diverse range of GPT and OPT models across 24 classification and multi-choice tasks, coupled with thorough analysis, unambiguously demonstrates the importance of using demonstrations that are semantically similar to the domain of the test examples. In fact, we show higher average classification performance using “similar” demonstrations with GPT-2 (124M) than random demonstrations with GPT-NeoX (20B). Notably, while diversity sampling shows promise, uncertainty sampling, despite its success in conventional supervised learning AL scenarios, performs poorly in in-context learning.
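
To make the similarity-based selection described in the abstract concrete, here is a minimal sketch of picking the k most similar demonstrations from a labeled pool for a single test input. It is an illustration under stated assumptions, not the authors' implementation: the bag-of-words `embed` function stands in for whatever sentence encoder is actually used, and the names `select_similar` and `build_prompt`, the toy pool, and the prompt template are all hypothetical.

```python
import math
from collections import Counter


def embed(text):
    """Toy bag-of-words vector; an assumed stand-in for a real sentence encoder."""
    return Counter(text.lower().split())


def cosine(a, b):
    """Cosine similarity between two sparse count vectors."""
    dot = sum(a[t] * b[t] for t in a if t in b)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def select_similar(pool, test_input, k):
    """Return the k (text, label) pairs from the pool most similar to test_input."""
    query = embed(test_input)
    ranked = sorted(pool, key=lambda ex: cosine(embed(ex[0]), query), reverse=True)
    return ranked[:k]


def build_prompt(demos, test_input):
    """Concatenate demonstrations and the unlabeled test input (hypothetical template)."""
    blocks = [f"Review: {x}\nSentiment: {y}\n" for x, y in demos]
    blocks.append(f"Review: {test_input}\nSentiment:")
    return "\n".join(blocks)


# Hypothetical labeled pool; in practice this would be the task's training set.
pool = [
    ("the movie was great fun", "positive"),
    ("the soup was cold and bland", "negative"),
    ("the movie was dull and lifeless", "negative"),
    ("the waiter was friendly", "positive"),
]

test_input = "the movie was dull"
demos = select_similar(pool, test_input, k=2)
prompt = build_prompt(demos, test_input)
print(prompt)
```

Random sampling, the baseline named in the abstract, would replace `select_similar` with `random.sample(pool, k)`; the rest of the prompt construction stays the same.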

Anthology ID: 2023.findings-emnlp.334
Volume: Findings of the Association for Computational Linguistics: EMNLP 2023
Month: December
Year: 2023
Address: Singapore
Editors: Houda Bouamor, Juan Pino, Kalika Bali
Venue: Findings
Publisher: Association for Computational Linguistics
Pages: 5011–5034
URL: https://aclanthology.org/2023.findings-emnlp.334
DOI: 10.18653/v1/2023.findings-emnlp.334
Cite (ACL): Katerina Margatina, Timo Schick, Nikolaos Aletras, and Jane Dwivedi-Yu. 2023. Active Learning Principles for In-Context Learning with Large Language Models. In Findings of the Association for Computational Linguistics: EMNLP 2023, pages 5011–5034, Singapore. Association for Computational Linguistics.
Cite (Informal): Active Learning Principles for In-Context Learning with Large Language Models (Margatina et al., Findings 2023)
PDF: https://aclanthology.org/2023.findings-emnlp.334.pdf
