Yi-Chia Wang


2022

The Moral Integrity Corpus: A Benchmark for Ethical Dialogue Systems
Caleb Ziems | Jane Yu | Yi-Chia Wang | Alon Halevy | Diyi Yang
Proceedings of the 60th Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers)

Conversational agents have come increasingly closer to human competence in open-domain dialogue settings; however, such models can reflect insensitive, hurtful, or entirely incoherent viewpoints that erode a user’s trust in the moral integrity of the system. Moral deviations are difficult to mitigate because moral judgments are not universal, and there may be multiple competing judgments that apply to a situation simultaneously. In this work, we introduce a new resource, not to authoritatively resolve moral ambiguities, but instead to facilitate systematic understanding of the intuitions, values and moral judgments reflected in the utterances of dialogue systems. The Moral Integrity Corpus, MIC, is such a resource, which captures the moral assumptions of 38k prompt-reply pairs, using 99k distinct Rules of Thumb (RoTs). Each RoT reflects a particular moral conviction that can explain why a chatbot’s reply may appear acceptable or problematic. We further organize RoTs with a set of 9 moral and social attributes and benchmark performance for attribute classification. Most importantly, we show that current neural language models can automatically generate new RoTs that reasonably describe previously unseen interactions, but they still struggle with certain scenarios. Our findings suggest that MIC will be a useful resource for understanding language models’ implicit moral assumptions and flexibly benchmarking the integrity of conversational agents. To download the data, see https://github.com/GT-SALT/mic

2020

Controllable Text Generation with Focused Variation
Lei Shu | Alexandros Papangelis | Yi-Chia Wang | Gokhan Tur | Hu Xu | Zhaleh Feizollahi | Bing Liu | Piero Molino
Findings of the Association for Computational Linguistics: EMNLP 2020

This work introduces Focused-Variation Network (FVN), a novel model to control language generation. The main problems in previous controlled language generation models range from the difficulty of generating text according to the given attributes, to the lack of diversity of the generated texts. FVN addresses these issues by learning disjoint discrete latent spaces for each attribute inside codebooks, which allows for both controllability and diversity, while at the same time generating fluent text. We evaluate FVN on two text generation datasets with annotated content and style, and show state-of-the-art performance as assessed by automatic and human evaluations.

2019

Collaborative Multi-Agent Dialogue Model Training Via Reinforcement Learning
Alexandros Papangelis | Yi-Chia Wang | Piero Molino | Gokhan Tur
Proceedings of the 20th Annual SIGdial Meeting on Discourse and Dialogue

Some of the major challenges in training conversational agents include the lack of large-scale data of real-world complexity, defining appropriate evaluation measures, and managing meaningful conversations across many topics over long periods of time. Moreover, most works tend to assume that the conversational agent’s environment is stationary, a somewhat strong assumption. To remove this assumption and overcome the lack of data, we take a step away from the traditional training pipeline and model the conversation as a stochastic collaborative game. Each agent (player) has a role (“assistant”, “tourist”, “eater”, etc.) and their own objectives, and can only interact via language they generate. Each agent, therefore, needs to learn to operate optimally in an environment with multiple sources of uncertainty (its own LU and LG, the other agent’s LU, Policy, and LG). In this work, we present the first complete attempt at concurrently training conversational agents that communicate only via self-generated language and show that they outperform supervised and deep learning baselines.

2010

Making Conversational Structure Explicit: Identification of Initiation-response Pairs within Online Discussions
Yi-Chia Wang | Carolyn P. Rosé
Human Language Technologies: The 2010 Annual Conference of the North American Chapter of the Association for Computational Linguistics

2007

A Feature Based Approach to Leveraging Context for Classifying Newsgroup Style Discussion Segments
Yi-Chia Wang | Mahesh Joshi | Carolyn Rosé
Proceedings of the 45th Annual Meeting of the Association for Computational Linguistics Companion Volume Proceedings of the Demo and Poster Sessions

2005

Web-Based Unsupervised Learning for Query Formulation in Question Answering
Yi-Chia Wang | Jian-Cheng Wu | Tyne Liang | Jason S. Chang
Second International Joint Conference on Natural Language Processing: Full Papers

2004

Using the Web as Corpus for Un-supervised Learning in Question Answering
Yi-Chia Wang | Jian-Cheng Wu | Tyne Liang | Jason S. Chang
Proceedings of the 16th Conference on Computational Linguistics and Speech Processing