Giovanni Campagna


2023

pdf bib
Contextual Semantic Parsing for Multilingual Task-Oriented Dialogues
Mehrad Moradshahi | Victoria Tsai | Giovanni Campagna | Monica Lam
Proceedings of the 17th Conference of the European Chapter of the Association for Computational Linguistics

Robust state tracking for task-oriented dialogue systems currently remains restricted to a few popular languages. This paper shows that given a large-scale dialogue data set in one language, we can automatically produce an effective semantic parser for other languages using machine translation. We propose automatic translation of dialogue datasets with alignment to ensure faithful translation of slot values and eliminate costly human supervision used in previous benchmarks. We also propose a new contextual semantic parsing model, which encodes the formal slots and values, and only the last agent and user utterances. We show that the succinct representation reduces the compounding effect of translation errors, without harming the accuracy in practice. We evaluate our approach on several dialogue state tracking benchmarks. On RiSAWOZ, CrossWOZ, CrossWOZ-EN, and MultiWOZ-ZH datasets we improve the state of the art by 11%, 17%, 20%, and 0.3% in joint goal accuracy. We present a comprehensive error analysis for all three datasets showing erroneous annotations can lead to misguided judgments on the quality of the model. Finally, we present RiSAWOZ English and German datasets, created using our translation methodology. On these datasets, accuracy is within 11% of the original showing that high-accuracy multilingual dialogue datasets are possible without relying on expensive human annotations. We release our datasets and software open source.

pdf bib
BYOC: Personalized Few-Shot Classification with Co-Authored Class Descriptions
Arth Bohra | Govert Verkes | Artem Harutyunyan | Pascal Weinberger | Giovanni Campagna
Findings of the Association for Computational Linguistics: EMNLP 2023

Text classification is a well-studied and versatile building block for many NLP applications. Yet, existing approaches require either large annotated corpora to train a model with or, when using large language models as a base, require carefully crafting the prompt as well as using a long context that can fit many examples. As a result, it is not possible for end-users to build classifiers for themselves. To address this issue, we propose a novel approach to few-shot text classification using an LLM. Rather than few-shot examples, the LLM is prompted with descriptions of the salient features of each class. These descriptions are coauthored by the user and the LLM interactively: while the user annotates each few-shot example, the LLM asks relevant questions that the user answers. Examples, questions, and answers are summarized to form the classification prompt. Our experiments show that our approach yields high accuracy classifiers, within 79% of the performance of models trained with significantly larger datasets while using only 1% of their training sets. Additionally, in a study with 30 participants, we show that end-users are able to build classifiers to suit their specific needs. The personalized classifiers show an average accuracy of 90%, which is 15% higher than the state-of-the-art approach.

2022

pdf bib
Neural Generation Meets Real People: Building a Social, Informative Open-Domain Dialogue Agent
Ethan A. Chi | Ashwin Paranjape | Abigail See | Caleb Chiam | Trenton Chang | Kathleen Kenealy | Swee Kiat Lim | Amelia Hardy | Chetanya Rastogi | Haojun Li | Alexander Iyabor | Yutong He | Hari Sowrirajan | Peng Qi | Kaushik Ram Sadagopan | Nguyet Minh Phu | Dilara Soylu | Jillian Tang | Avanika Narayan | Giovanni Campagna | Christopher Manning
Proceedings of the 23rd Annual Meeting of the Special Interest Group on Discourse and Dialogue

We present Chirpy Cardinal, an open-domain social chatbot. Aiming to be both informative and conversational, our bot chats with users in an authentic, emotionally intelligent way. By integrating controlled neural generation with scaffolded, hand-written dialogue, we let both the user and bot take turns driving the conversation, producing an engaging and socially fluent experience. Deployed in the fourth iteration of the Alexa Prize Socialbot Grand Challenge, Chirpy Cardinal handled thousands of conversations per day, placing second out of nine bots with an average user rating of 3.58/5.

pdf bib
A Few-Shot Semantic Parser for Wizard-of-Oz Dialogues with the Precise ThingTalk Representation
Giovanni Campagna | Sina Semnani | Ryan Kearns | Lucas Jun Koba Sato | Silei Xu | Monica Lam
Findings of the Association for Computational Linguistics: ACL 2022

Previous attempts to build effective semantic parsers for Wizard-of-Oz (WOZ) conversations suffer from the difficulty in acquiring a high-quality, manually annotated training set. Approaches based only on dialogue synthesis are insufficient, as dialogues generated from state-machine based models are poor approximations of real-life conversations. Furthermore, previously proposed dialogue state representations are ambiguous and lack the precision necessary for building an effective agent. This paper proposes a new dialogue representation and a sample-efficient methodology that can predict precise dialogue states in WOZ conversations. We extended the ThingTalk representation to capture all information an agent needs to respond properly. Our training strategy is sample-efficient: we combine (1) few-shot data sparsely sampling the full dialogue space and (2) synthesized data covering a subset space of dialogues generated by a succinct state-based dialogue model. The completeness of the extended ThingTalk language is demonstrated with a fully operational agent, which is also used in training data synthesis. We demonstrate the effectiveness of our methodology on MultiWOZ 3.0, a reannotation of the MultiWOZ 2.1 dataset in ThingTalk. ThingTalk can represent 98% of the test turns, while the simulator can emulate 85% of the validation set. We train a contextual semantic parser using our strategy, and obtain 79% turn-by-turn exact match accuracy on the reannotated test set.

2021

pdf bib
Grounding Open-Domain Instructions to Automate Web Support Tasks
Nancy Xu | Sam Masling | Michael Du | Giovanni Campagna | Larry Heck | James Landay | Monica Lam
Proceedings of the 2021 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies

Grounding natural language instructions on the web to perform previously unseen tasks enables accessibility and automation. We introduce a task and dataset to train AI agents from open-domain, step-by-step instructions originally written for people. We build RUSS (Rapid Universal Support Service) to tackle this problem. RUSS consists of two models: First, a BERT-LSTM with pointers parses instructions to WebLang, a domain-specific language we design for grounding natural language on the web. Then, a grounding model retrieves the unique IDs of any webpage elements requested in the WebLang. RUSS may interact with the user through a dialogue (e.g. ask for an address) or execute a web operation (e.g. click a button) inside the web runtime. To augment training, we synthesize natural language instructions mapped to WebLang. Our dataset consists of 80 different customer service problems from help websites, with a total of 741 step-by-step instructions and their corresponding actions. RUSS achieves 76.7% end-to-end accuracy predicting agent actions from single instructions. It outperforms state-of-the-art models that directly map instructions to actions without WebLang. Our user study shows that RUSS is preferred by actual users over web navigation.

2020

pdf bib
AutoQA: From Databases To QA Semantic Parsers With Only Synthetic Training Data
Silei Xu | Sina Semnani | Giovanni Campagna | Monica Lam
Proceedings of the 2020 Conference on Empirical Methods in Natural Language Processing (EMNLP)

We propose AutoQA, a methodology and toolkit to generate semantic parsers that answer questions on databases, with no manual effort. Given a database schema and its data, AutoQA automatically generates a large set of high-quality questions for training that covers different database operations. It uses automatic paraphrasing combined with template-based parsing to find alternative expressions of an attribute in different parts of speech. It also uses a novel filtered auto-paraphraser to generate correct paraphrases of entire sentences. We apply AutoQA to the Schema2QA dataset and obtain an average logical form accuracy of 62.9% when tested on natural questions, which is only 6.4% lower than a model trained with expert natural language annotations and paraphrase data collected from crowdworkers. To demonstrate the generality of AutoQA, we also apply it to the Overnight dataset. AutoQA achieves 69.8% answer accuracy, 16.4% higher than the state-of-the-art zero-shot models and only 5.2% lower than the same model trained with human data.

pdf bib
Localizing Open-Ontology QA Semantic Parsers in a Day Using Machine Translation
Mehrad Moradshahi | Giovanni Campagna | Sina Semnani | Silei Xu | Monica Lam
Proceedings of the 2020 Conference on Empirical Methods in Natural Language Processing (EMNLP)

We propose Semantic Parser Localizer (SPL), a toolkit that leverages Neural Machine Translation (NMT) systems to localize a semantic parser for a new language. Our methodology is to (1) generate training data automatically in the target language by augmenting machine-translated datasets with local entities scraped from public websites, (2) add a few-shot boost of human-translated sentences and train a novel XLMR-LSTM semantic parser, and (3) test the model on natural utterances curated using human translators. We assess the effectiveness of our approach by extending the current capabilities of Schema2QA, a system for English Question Answering (QA) on the open web, to 10 new languages for the restaurants and hotels domains. Our model achieves an overall test accuracy ranging between 61% and 69% for the hotels domain and between 64% and 78% for restaurants domain, which compares favorably to 69% and 80% obtained for English parser trained on gold English data and a few examples from validation set. We show our approach outperforms the previous state-of-the-art methodology by more than 30% for hotels and 40% for restaurants with localized ontologies for the subset of languages tested. Our methodology enables any software developer to add a new language capability to a QA system for a new domain, leveraging machine translation, in less than 24 hours. Our code is released open-source.

pdf bib
Zero-Shot Transfer Learning with Synthesized Data for Multi-Domain Dialogue State Tracking
Giovanni Campagna | Agata Foryciarz | Mehrad Moradshahi | Monica Lam
Proceedings of the 58th Annual Meeting of the Association for Computational Linguistics

Zero-shot transfer learning for multi-domain dialogue state tracking can allow us to handle new domains without incurring the high cost of data acquisition. This paper proposes new zero-short transfer learning technique for dialogue state tracking where the in-domain training data are all synthesized from an abstract dialogue model and the ontology of the domain. We show that data augmentation through synthesized data can improve the accuracy of zero-shot learning for both the TRADE model and the BERT-based SUMBT model on the MultiWOZ 2.1 dataset. We show training with only synthesized in-domain data on the SUMBT model can reach about 2/3 of the accuracy obtained with the full training dataset. We improve the zero-shot learning state of the art on average across domains by 21%.