Braden Hancock
2019
Learning from Dialogue after Deployment: Feed Yourself, Chatbot!
Braden Hancock | Antoine Bordes | Pierre-Emmanuel Mazare | Jason Weston
Proceedings of the 57th Annual Meeting of the Association for Computational Linguistics
The majority of conversations a dialogue agent sees over its lifetime occur after it has already been trained and deployed, leaving a vast store of potential training signal untapped. In this work, we propose the self-feeding chatbot, a dialogue agent with the ability to extract new training examples from the conversations it participates in. As our agent engages in conversation, it also estimates user satisfaction in its responses. When the conversation appears to be going well, the user’s responses become new training examples to imitate. When the agent believes it has made a mistake, it asks for feedback; learning to predict the feedback that will be given improves the chatbot’s dialogue abilities further. On the PersonaChat chit-chat dataset with over 131k training examples, we find that learning from dialogue with a self-feeding chatbot significantly improves performance, regardless of the amount of traditional supervision.
2018
Training Classifiers with Natural Language Explanations
Braden Hancock | Paroma Varma | Stephanie Wang | Martin Bringmann | Percy Liang | Christopher Ré
Proceedings of the 56th Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers)
Training accurate classifiers requires many labels, but each label provides only limited information (one bit for binary classification). In this work, we propose BabbleLabble, a framework for training classifiers in which an annotator provides a natural language explanation for each labeling decision. A semantic parser converts these explanations into programmatic labeling functions that generate noisy labels for an arbitrary amount of unlabeled data, which is used to train a classifier. On three relation extraction tasks, we find that users are able to train classifiers with comparable F1 scores from 5-100× faster by providing explanations instead of just labels. Furthermore, given the inherent imperfection of labeling functions, we find that a simple rule-based semantic parser suffices.
Co-authors
- Antoine Bordes 1
- Martin Bringmann 1
- Percy Liang 1
- Pierre-Emmanuel Mazare 1
- Christopher Ré 1
Venues
- ACL 2