Active Learning for Robust and Representative LLM Generation in Safety-Critical Scenarios

Sabit Hassan; Anthony Sicilia; Malihe Alikhani

doi:10.18653/v1/2024.customnlp4u-1.10

Abstract

Ensuring robust safety measures across a wide range of scenarios is crucial for user-facing systems. While Large Language Models (LLMs) can generate valuable data for safety measures, they often exhibit distributional biases, focusing on common scenarios and neglecting rare but critical cases. This can undermine the effectiveness of safety protocols developed using such data. To address this, we propose a novel framework that integrates active learning with clustering to guide LLM generation, enhancing its representativeness and robustness in safety scenarios. We demonstrate the effectiveness of our approach by constructing a dataset of 5.4K potential safety violations through an iterative process involving LLM generation and an active learner model’s feedback. Our results show that the proposed framework produces a more representative set of safety scenarios without requiring prior knowledge of the underlying data distribution. Additionally, data acquired through our method improves the accuracy and F1 score of both the active learner model and models outside the scope of the active learning process, highlighting its broad applicability.
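
The abstract does not spell out how clustering and active learning are combined, so the following is only a minimal sketch of one generic possibility, not the authors' method. It assumes each generated scenario already has an embedding and a learner-predicted probability of being a violation; the function names, the toy k-means with farthest-first initialization, and the "most uncertain item per cluster" rule are all illustrative assumptions.

```python
import math

def entropy(p):
    """Binary entropy of a predicted violation probability (uncertainty proxy)."""
    if p <= 0.0 or p >= 1.0:
        return 0.0
    return -(p * math.log2(p) + (1 - p) * math.log2(1 - p))

def farthest_first(points, k):
    """Deterministic initial centers: start at points[0], then add the farthest point."""
    centers = [points[0]]
    while len(centers) < k:
        centers.append(max(points, key=lambda p: min(math.dist(p, c) for c in centers)))
    return centers

def kmeans(points, k, iters=20):
    """Toy k-means; returns the cluster index assigned to each point."""
    centers = farthest_first(points, k)

    def nearest(p):
        return min(range(k), key=lambda c: math.dist(p, centers[c]))

    for _ in range(iters):
        assign = [nearest(p) for p in points]
        for c in range(k):
            members = [p for p, a in zip(points, assign) if a == c]
            if members:  # keep the old center if a cluster becomes empty
                centers[c] = tuple(sum(x) / len(members) for x in zip(*members))
    return [nearest(p) for p in points]

def select_per_cluster(embeddings, probs, k):
    """Pick the most uncertain item from each cluster (hypothetical selection rule)."""
    assign = kmeans(embeddings, k)
    picks = {}
    for i, c in enumerate(assign):
        if c not in picks or entropy(probs[i]) > entropy(probs[picks[c]]):
            picks[c] = i
    return sorted(picks.values())

# Two well-separated groups of toy 2-D "embeddings" (made-up values).
embeddings = [(0.0, 0.0), (0.1, 0.0), (0.0, 0.1), (5.0, 5.0), (5.1, 5.0), (5.0, 5.1)]
probs = [0.5, 0.55, 0.9, 0.05, 0.2, 0.01]
print(select_per_cluster(embeddings, probs, k=2))  # [0, 4]
```

In this toy data, plain uncertainty sampling would take items 0 and 1, both from the first group. Selecting per cluster instead returns one item from each group, which illustrates how clustering can spread acquisitions over less common regions.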

Anthology ID: 2024.customnlp4u-1.10
Volume: Proceedings of the 1st Workshop on Customizable NLP: Progress and Challenges in Customizing NLP for a Domain, Application, Group, or Individual (CustomNLP4U)
Month: November
Year: 2024
Address: Miami, Florida, USA
Editors: Sachin Kumar, Vidhisha Balachandran, Chan Young Park, Weijia Shi, Shirley Anugrah Hayati, Yulia Tsvetkov, Noah Smith, Hannaneh Hajishirzi, Dongyeop Kang, David Jurgens
Venues: CustomNLP4U | WS
Publisher: Association for Computational Linguistics
Pages: 113–123
URL: https://aclanthology.org/2024.customnlp4u-1.10/
DOI: 10.18653/v1/2024.customnlp4u-1.10
Cite (ACL): Sabit Hassan, Anthony Sicilia, and Malihe Alikhani. 2024. Active Learning for Robust and Representative LLM Generation in Safety-Critical Scenarios. In Proceedings of the 1st Workshop on Customizable NLP: Progress and Challenges in Customizing NLP for a Domain, Application, Group, or Individual (CustomNLP4U), pages 113–123, Miami, Florida, USA. Association for Computational Linguistics.
Cite (Informal): Active Learning for Robust and Representative LLM Generation in Safety-Critical Scenarios (Hassan et al., CustomNLP4U 2024)
PDF: https://aclanthology.org/2024.customnlp4u-1.10.pdf
