Melanie Bradford


2023

pdf bib
Reducing cohort bias in natural language understanding systems with targeted self-training scheme
Dieu-thu Le | Gabriela Hernandez | Bei Chen | Melanie Bradford
Proceedings of the 61st Annual Meeting of the Association for Computational Linguistics (Volume 5: Industry Track)

Bias in machine learning models can be an issue when the models are trained on particular types of data that do not generalize well, causing under performance in certain groups of users. In this work, we focus on reducing the bias related to new customers in a digital voice assistant system. It is observed that natural language understanding models often have lower performance when dealing with requests coming from new users rather than experienced users. To mitigate this problem, we propose a framework that consists of two phases (1) a fixing phase with four active learning strategies used to identify important samples coming from new users, and (2) a self training phase where a teacher model trained from the first phase is used to annotate semi-supervised samples to expand the training data with relevant cohort utterances. We explain practical strategies that involve an identification of representative cohort-based samples through density clustering as well as employing implicit customer feedbacks to improve new customers’ experience. We demonstrate the effectiveness of our approach in a real world large scale voice assistant system for two languages, German and French through both offline experiments as well as A/B testings.

2022

pdf bib
Semi-supervised Adversarial Text Generation based on Seq2Seq models
Hieu Le | Dieu-thu Le | Verena Weber | Chris Church | Kay Rottmann | Melanie Bradford | Peter Chin
Proceedings of the 2022 Conference on Empirical Methods in Natural Language Processing: Industry Track

To improve deep learning models’ robustness, adversarial training has been frequently used in computer vision with satisfying results. However, adversarial perturbation on text have turned out to be more challenging due to the discrete nature of text. The generated adversarial text might not sound natural or does not preserve semantics, which is the key for real world applications where text classification is based on semantic meaning. In this paper, we describe a new way for generating adversarial samples by using pseudo-labeled in-domain text data to train a seq2seq model for adversarial generation and combine it with paraphrase detection. We showcase the benefit of our approach for a real-world Natural Language Understanding (NLU) task, which maps a user’s request to an intent. Furthermore, we experiment with gradient-based training for the NLU task and try using token importance scores to guide the adversarial text generation. We show that our approach can generate realistic and relevant adversarial samples compared to other state-of-the-art adversarial training methods. Applying adversarial training using these generated samples helps the NLU model to recover up to 70% of these types of errors and makes the model more robust, especially in the tail distribution in a large scale real world application.

2021

pdf bib
It is better to Verify: Semi-Supervised Learning with a human in the loop for large-scale NLU models
Verena Weber | Enrico Piovano | Melanie Bradford
Proceedings of the Second Workshop on Data Science with Human in the Loop: Language Advances

When a NLU model is updated, new utter- ances must be annotated to be included for training. However, manual annotation is very costly. We evaluate a semi-supervised learning workflow with a human in the loop in a produc- tion environment. The previous NLU model predicts the annotation of the new utterances, a human then reviews the predicted annotation. Only when the NLU prediction is assessed as incorrect the utterance is sent for human anno- tation. Experimental results show that the pro- posed workflow boosts the performance of the NLU model while significantly reducing the annotation volume. Specifically, in our setup, we see improvements of up to 14.16% for a recall-based metric and up to 9.57% for a F1- score based metric, while reducing the annota- tion volume by 97% and overall cost by 60% for each iteration.

pdf bib
The impact of domain-specific representations on BERT-based multi-domain spoken language understanding
Judith Gaspers | Quynh Do | Tobias Röding | Melanie Bradford
Proceedings of the Second Workshop on Domain Adaptation for NLP

This paper provides the first experimental study on the impact of using domain-specific representations on a BERT-based multi-task spoken language understanding (SLU) model for multi-domain applications. Our results on a real-world dataset covering three languages indicate that by using domain-specific representations learned adversarially, model performance can be improved across all of the three SLU subtasks domain classification, intent classification and slot filling. Gains are particularly large for domains with limited training data.

pdf bib
Combining semantic search and twin product classification for recognition of purchasable items in voice shopping
Dieu-Thu Le | Verena Weber | Melanie Bradford
Proceedings of the 4th Workshop on e-Commerce and NLP

The accuracy of an online shopping system via voice commands is particularly important and may have a great impact on customer trust. This paper focuses on the problem of detecting if an utterance contains actual and purchasable products, thus referring to a shopping-related intent in a typical Spoken Language Understanding architecture consist- ing of an intent classifier and a slot detec- tor. Searching through billions of products to check if a detected slot is a purchasable item is prohibitively expensive. To overcome this problem, we present a framework that (1) uses a retrieval module that returns the most rele- vant products with respect to the detected slot, and (2) combines it with a twin network that decides if the detected slot is indeed a pur- chasable item or not. Through various exper- iments, we show that this architecture outper- forms a typical slot detector approach, with a gain of +81% in accuracy and +41% in F1 score.

2020

pdf bib
To What Degree Can Language Borders Be Blurred In BERT-based Multilingual Spoken Language Understanding?
Quynh Do | Judith Gaspers | Tobias Roeding | Melanie Bradford
Proceedings of the 28th International Conference on Computational Linguistics

This paper addresses the question as to what degree a BERT-based multilingual Spoken Language Understanding (SLU) model can transfer knowledge across languages. Through experiments we will show that, although it works substantially well even on distant language groups, there is still a gap to the ideal multilingual performance. In addition, we propose a novel BERT-based adversarial model architecture to learn language-shared and language-specific representations for multilingual SLU. Our experimental results prove that the proposed model is capable of narrowing the gap to the ideal multilingual performance.