Susan Robinson


2010

As conversational agents are now being developed to encounter more complex dialogue situations it is increasingly difficult to find satisfactory methods for evaluating these agents. Task-based measures are insufficient where there is no clearly defined task. While user-based evaluation methods may give a general sense of the quality of an agent's performance, they shed little light on the relative quality or success of specific features of dialogue that are necessary for system improvement. This paper examines current dialogue agent evaluation practices and motivates the need for a more detailed approach for defining and measuring the quality of dialogues between agent and user. We present a framework for evaluating the dialogue competence of artificial agents involved in complex and underspecified tasks when conversing with people. A multi-part coding scheme is proposed that provides a qualitative analysis of human utterances, and rates the appropriateness of the agent's responses to these utterances. The scheme is outlined, and then used to evaluate Staff Duty Officer Moleno, a virtual guide in Second Life.

2008

When dealing with large, distributed systems that use state-of-the-art components, individual components are usually developed in parallel. As development continues, the decoupling invariably leads to a mismatch between how these components internally represent concepts and how they communicate these representations to other components: representations can get out of synch, contain localized errors, or become manageable only by a small group of experts for each module. In this paper, we describe the use of an ontology as part of a complex distributed virtual human architecture in order to enable better communication between modules while improving the overall flexibility needed to change or extend the system. We focus on the natural language understanding capabilities of this architecture and the relationship between language and concepts within the entire system in general and the ontology in particular.
Embodied Conversational Agents have typically been constructed for use in limited domain applications, and tested in very specialized environments. Only in recent years have there been more cases of moving agents into wider public applications (e.g.Bell et al., 2003; Kopp et al., 2005). Yet little analysis has been done to determine the differing needs, expectations, and behavior of human users in these environments. With an increasing trend for virtual characters to “go public”, we need to expand our understanding of what this entails for the design and capabilities of our characters. This paper explores these issues through an analysis of a corpus that has been collected since December 2006, from interactions with the virtual character Sgt Blackwell at the Cooper Hewitt Museum in New York. The analysis includes 82 hierarchical categories of user utterances, as well as specific observations on user preferences and behaviors drawn from interactions with Blackwell.

2007

2005

2004