2024
pdf
bib
abs
HelloThere: A Corpus of Annotated Dialogues and Knowledge Bases of Time-Offset Avatars
Alberto Chierici
|
Nizar Habash
Proceedings of the 25th Annual Meeting of the Special Interest Group on Discourse and Dialogue
A Time-Offset Interaction Application (TOIA) is a software system that allows people to engage in face-to-face dialogue with previously recorded videos of other people. There are two TOIA usage modes: (a) creation mode, where users pre-record video snippets of themselves representing their answers to possible questions someone may ask them, and (b) interaction mode, where other users of the system can choose to interact with created avatars. This paper presents the HelloThere corpus that has been collected from two user studies involving several people who recorded avatars and many more who engaged in dialogues with them. The interactions with avatars are annotated by people asking them questions through three modes (card selection, text search, and voice input) and rating the appropriateness of their answers on a 1 to 5 scale. The corpus, made available to the research community, comprises 26 avatars’ knowledge bases and 317 dialogues between 64 interrogators and the avatars in text format.
2021
pdf
bib
abs
A View From the Crowd: Evaluation Challenges for Time-Offset Interaction Applications
Alberto Chierici
|
Nizar Habash
Proceedings of the Workshop on Human Evaluation of NLP Systems (HumEval)
Dialogue systems like chatbots, and tasks like question-answering (QA) have gained traction in recent years; yet evaluating such systems remains difficult. Reasons include the great variety in contexts and use cases for these systems as well as the high cost of human evaluation. In this paper, we focus on a specific type of dialogue systems: Time-Offset Interaction Applications (TOIAs) are intelligent, conversational software that simulates face-to-face conversations between humans and pre-recorded human avatars. Under the constraint that a TOIA is a single output system interacting with users with different expectations, we identify two challenges: first, how do we define a ‘good’ answer? and second, what’s an appropriate metric to use? We explore both challenges through the creation of a novel dataset that identifies multiple good answers to specific TOIA questions through the help of Amazon Mechanical Turk workers. This ‘view from the crowd’ allows us to study the variations of how TOIA interrogators perceive its answers. Our contributions include the annotated dataset that we make publicly available and the proposal of Success Rate @k as an evaluation metric that is more appropriate than the traditional QA’s and information retrieval’s metrics.
pdf
bib
abs
A Cloud-based User-Centered Time-Offset Interaction Application
Alberto Chierici
|
Tyeece Kiana Fredorcia Hensley
|
Wahib Kamran
|
Kertu Koss
|
Armaan Agrawal
|
Erin Meekhof
|
Goffredo Puccetti
|
Nizar Habash
Proceedings of the 22nd Annual Meeting of the Special Interest Group on Discourse and Dialogue
Time-offset interaction applications (TOIA) allow simulating conversations with people who have previously recorded relevant video utterances, which are played in response to their interacting user. TOIAs have great potential for preserving cross-generational and cross-cultural histories, online teaching, simulated interviews, etc. Current TOIAs exist in niche contexts involving high production costs. Democratizing TOIA presents different challenges when creating appropriate pre-recordings, designing different user stories, and creating simple online interfaces for experimentation. We open-source TOIA 2.0, a user-centered time-offset interaction application, and make it available for everyone who wants to interact with people’s pre-recordings, or create their pre-recordings.
2020
pdf
bib
abs
The Margarita Dialogue Corpus: A Data Set for Time-Offset Interactions and Unstructured Dialogue Systems
Alberto Chierici
|
Nizar Habash
|
Margarita Bicec
Proceedings of the Twelfth Language Resources and Evaluation Conference
Time-Offset Interaction Applications (TOIAs) are systems that simulate face-to-face conversations between humans and digital human avatars recorded in the past. Developing a well-functioning TOIA involves several research areas: artificial intelligence, human-computer interaction, natural language processing, question answering, and dialogue systems. The first challenges are to define a sensible methodology for data collection and to create useful data sets for training the system to retrieve the best answer to a user’s question. In this paper, we present three main contributions: a methodology for creating the knowledge base for a TOIA, a dialogue corpus, and baselines for single-turn answer retrieval. We develop the methodology using a two-step strategy. First, we let the avatar maker list pairs by intuition, guessing what possible questions a user may ask to the avatar. Second, we record actual dialogues between random individuals and the avatar-maker. We make the Margarita Dialogue Corpus available to the research community. This corpus comprises the knowledge base in text format, the video clips for each answer, and the annotated dialogues.