Arun Tejasvi Chaganty
2023
RARR: Researching and Revising What Language Models Say, Using Language Models
Luyu Gao | Zhuyun Dai | Panupong Pasupat | Anthony Chen | Arun Tejasvi Chaganty | Yicheng Fan | Vincent Zhao | Ni Lao | Hongrae Lee | Da-Cheng Juan | Kelvin Guu
Proceedings of the 61st Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers)
Language models (LMs) now excel at many tasks such as question answering, reasoning, and dialog. However, they sometimes generate unsupported or misleading content. A user cannot easily determine whether their outputs are trustworthy or not, because most LMs do not have any built-in mechanism for attribution to external evidence. To enable attribution while still preserving all the powerful advantages of recent generation models, we propose RARR (Retrofit Attribution using Research and Revision), a system that 1) automatically finds attribution for the output of any text generation model, and 2) post-edits the output to fix unsupported content while preserving the original output as much as possible. When applied to the output of several state-of-the-art LMs on a diverse set of generation tasks, we find that RARR significantly improves attribution while otherwise preserving the original input to a much greater degree than previously explored edit models. Furthermore, the implementation of RARR requires only a handful of training examples, a large language model, and standard web search.
2019
Mimic and Rephrase: Reflective Listening in Open-Ended Dialogue
Justin Dieter | Tian Wang | Arun Tejasvi Chaganty | Gabor Angeli | Angel X. Chang
Proceedings of the 23rd Conference on Computational Natural Language Learning (CoNLL)
Reflective listening–demonstrating that you have heard your conversational partner–is key to effective communication. Expert human communicators often mimic and rephrase their conversational partner, e.g., when responding to sentimental stories or to questions they don’t know the answer to. We introduce a new task and an associated dataset wherein dialogue agents similarly mimic and rephrase a user’s request to communicate sympathy (I’m sorry to hear that) or lack of knowledge (I do not know that). We study what makes a rephrasal response good against a set of qualitative metrics. We then evaluate three models for generating responses: a syntax-aware rule-based system, a seq2seq LSTM neural model with attention (S2SA), and the same neural model augmented with a copy mechanism (S2SA+C). In a human evaluation, we find that S2SA+C and the rule-based system are comparable and approach human-generated response quality. In addition, experiences with a live deployment of S2SA+C in a customer support setting suggest that this generation task is a practical contribution to real-world conversational agents.