Guideline Bias in Wizard-of-Oz Dialogues

Victor Petrén Bach Hansen, Anders Søgaard


Abstract
NLP models struggle with generalization due to sampling and annotator bias. This paper focuses on a different kind of bias that has received very little attention: guideline bias, i.e., the bias introduced by how our annotator guidelines are formulated. We examine two recently introduced dialogue datasets, CCPE-M and Taskmaster-1, both collected by trained assistants in a Wizard-of-Oz set-up. For CCPE-M, we show how a simple lexical bias for the word "like" in the guidelines biases the data collection. This bias, in effect, leads to poor performance on data without this bias: a preference elicitation architecture based on BERT suffers a 5.3% absolute drop in performance when "like" is replaced with a synonymous phrase, and a 13.2% drop in performance when evaluated on out-of-sample data. For Taskmaster-1, we show how the order in which instructions are presented biases the data collection.
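
A rough illustration of the perturbation described above: the sketch below replaces whole-word occurrences of "like" in a user utterance with a synonymous phrase before evaluation. The particular phrase ("am fond of") and the plain-string utterance format are assumptions made for illustration; they are not details taken from the paper or the vpetren/guideline_bias code.

import re

# Hypothetical synonymous phrase; the paper does not name the exact substitution here.
SYNONYM = "am fond of"

def perturb_utterance(text, target="like", replacement=SYNONYM):
    # Replace whole-word matches of `target`, keeping sentence-initial capitalisation.
    def repl(match):
        word = match.group(0)
        return replacement.capitalize() if word[0].isupper() else replacement
    return re.sub(rf"\b{re.escape(target)}\b", repl, text, flags=re.IGNORECASE)

print(perturb_utterance("I like movies with a strong plot."))
# -> "I am fond of movies with a strong plot."

Evaluating a preference elicitation model on such perturbed utterances, rather than on the original test set, is one way to probe whether it has overfit to the guideline's lexical choices.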
Anthology ID:
2021.bppf-1.2
Volume:
Proceedings of the 1st Workshop on Benchmarking: Past, Present and Future
Month:
Aug
Year:
2021
Address:
Online
Editors:
Kenneth Church, Mark Liberman, Valia Kordoni
Venue:
BPPF
Publisher:
Association for Computational Linguistics
Pages:
8–14
URL:
https://aclanthology.org/2021.bppf-1.2
DOI:
10.18653/v1/2021.bppf-1.2
Cite (ACL):
Victor Petrén Bach Hansen and Anders Søgaard. 2021. Guideline Bias in Wizard-of-Oz Dialogues. In Proceedings of the 1st Workshop on Benchmarking: Past, Present and Future, pages 8–14, Online. Association for Computational Linguistics.
Cite (Informal):
Guideline Bias in Wizard-of-Oz Dialogues (Hansen & Søgaard, BPPF 2021)
PDF:
https://aclanthology.org/2021.bppf-1.2.pdf
Code:
vpetren/guideline_bias
Data:
CCPE-M, Taskmaster-1