Efficiently Acquiring Human Feedback with Bayesian Deep Learning

Haishuo Fang, Jeet Gor, Edwin Simpson


Abstract
Learning from human feedback can improve models for text generation or passage ranking, aligning them better to a user’s needs. Data is often collected by asking users to compare alternative outputs to a given input, which may require a large number of comparisons to learn a ranking function. The number of comparisons needed can be reduced using Bayesian Optimisation (BO) to query the user about only the most promising candidate outputs. Previous applications of BO to text ranking relied on shallow surrogate models to learn ranking functions over candidate outputs, and were therefore unable to fine-tune rankers based on deep, pretrained language models. This paper leverages Bayesian deep learning (BDL) to adapt pretrained language models to highly specialised text ranking tasks, using BO to tune the model with a small number of pairwise preferences between candidate outputs. We apply our approach to community question answering (cQA) and extractive multi-document summarisation (MDS) with simulated noisy users, finding that our BDL approach significantly outperforms both a shallow Gaussian process model and traditional active learning with a standard deep neural network, while remaining robust to noise in the user feedback.
Anthology ID:
2024.uncertainlp-1.7
Volume:
Proceedings of the 1st Workshop on Uncertainty-Aware NLP (UncertaiNLP 2024)
Month:
March
Year:
2024
Address:
St Julians, Malta
Editors:
Raúl Vázquez, Hande Celikkanat, Dennis Ulmer, Jörg Tiedemann, Swabha Swayamdipta, Wilker Aziz, Barbara Plank, Joris Baan, Marie-Catherine de Marneffe
Venues:
UncertaiNLP | WS
Publisher:
Association for Computational Linguistics
Pages:
70–80
URL:
https://aclanthology.org/2024.uncertainlp-1.7
Cite (ACL):
Haishuo Fang, Jeet Gor, and Edwin Simpson. 2024. Efficiently Acquiring Human Feedback with Bayesian Deep Learning. In Proceedings of the 1st Workshop on Uncertainty-Aware NLP (UncertaiNLP 2024), pages 70–80, St Julians, Malta. Association for Computational Linguistics.
Cite (Informal):
Efficiently Acquiring Human Feedback with Bayesian Deep Learning (Fang et al., UncertaiNLP-WS 2024)
PDF:
https://aclanthology.org/2024.uncertainlp-1.7.pdf