Ramesh Manuvinakurike

Also published as: Ramesh Manuvirakurike


2022

Cue-bot: A Conversational Agent for Assistive Technology
Shachi H Kumar | Hsuan Su | Ramesh Manuvinakurike | Maximilian C. Pinaroc | Sai Prasad | Saurav Sahay | Lama Nachman
Proceedings of the 60th Annual Meeting of the Association for Computational Linguistics: System Demonstrations

Intelligent conversational assistants have become an integral part of our lives for performing simple tasks. However, such agents, for example, Google bots, Alexa and others, are yet to have any social impact on minority populations, for example, people with neurological disorders and people with speech, language and social communication disorders, sometimes with locked-in states where speaking or typing is a challenge. Language model technologies can be very powerful tools in enabling these users to carry out daily communication and social interactions. In this work, we present a system that users with varied levels of disabilities can use to interact with the world, supported by eye-tracking, mouse controls and an intelligent agent, Cue-bot, that can represent the user in a conversation. The agent provides relevant controllable ‘cues’ to generate desirable responses quickly for an ongoing dialog context. In the context of usage of such systems for people with degenerative disorders, we present automatic and human evaluation of our cue/keyword predictor and the controllable dialog system and show that our models perform significantly better than models without control and can also reduce user effort (fewer keystrokes) and speed up communication (typing time) significantly.

CueBot: Cue-Controlled Response Generation for Assistive Interaction Usages
Shachi H. Kumar | Hsuan Su | Ramesh Manuvinakurike | Max Pinaroc | Sai Prasad | Saurav Sahay | Lama Nachman
Ninth Workshop on Speech and Language Processing for Assistive Technologies (SLPAT-2022)

Conversational assistants are ubiquitous among the general population; however, these systems have not had an impact on people with disabilities, or speech and language disorders, for whom basic day-to-day communication and social interaction are a huge struggle. Language model technology can play a huge role in empowering these users and helping them interact with others with less effort via interaction support. To enable this population, we build a system that can represent them in a social conversation and generate responses that can be controlled by the users using cues/keywords. We build models that can speed up this communication by suggesting relevant cues in the dialog response context. We also introduce a keyword-loss to lexically constrain the model response output. We present automatic and human evaluation of our cue/keyword predictor and the controllable dialog system to show that our models perform significantly better than models without control. Our evaluation and user study show that keyword-control on end-to-end response generation models is powerful and can enable and empower users with degenerative disorders to carry out their day-to-day communication.

2021

Estimating Subjective Crowd-Evaluations as an Additional Objective to Improve Natural Language Generation
Jakob Nyberg | Maike Paetzel | Ramesh Manuvinakurike
Proceedings of the Workshop on Human Evaluation of NLP Systems (HumEval)

Human ratings are one of the most prevalent methods to evaluate the performance of NLP (natural language processing) algorithms. Similarly, it is common to measure the quality of sentences generated by a natural language generation model using human raters. In this paper we argue for exploring the use of subjective evaluations within the process of training language generation models in a multi-task learning setting. As a case study, we use a crowd-authored dialogue corpus to fine-tune six different language generation models. Two of these models incorporate multi-task learning and use subjective ratings of lines as part of an explicit learning goal. A human evaluation of the generated dialogue lines reveals that utterances generated by the multi-tasking models were subjectively rated as the most typical, most moving the conversation forward, and least offensive. Based on these promising first results, we discuss future research directions for incorporating subjective human evaluations into language model training and hence for keeping the human user in the loop during the development process.

Incremental temporal summarization in multi-party meetings
Ramesh Manuvinakurike | Saurav Sahay | Wenda Chen | Lama Nachman
Proceedings of the 22nd Annual Meeting of the Special Interest Group on Discourse and Dialogue

In this work, we develop a dataset for incremental temporal summarization in a multiparty dialogue. We use a crowd-sourcing paradigm with a model-in-the-loop approach for collecting the summaries and compare the data with the expert summaries. We leverage the question generation paradigm to automatically generate questions from the dialogue, which can be used to validate the user participation and potentially also draw the user's attention to the contents they need to summarize. We then develop several models for abstractive summary generation in the incremental temporal scenario. We perform a detailed analysis of the results and show that including the past context into the summary generation yields better summaries.

Proceedings of the CODI-CRAC 2021 Shared Task on Anaphora, Bridging, and Discourse Deixis in Dialogue
Sopan Khosla | Ramesh Manuvinakurike | Vincent Ng | Massimo Poesio | Michael Strube | Carolyn Rosé
Proceedings of the CODI-CRAC 2021 Shared Task on Anaphora, Bridging, and Discourse Deixis in Dialogue

The CODI-CRAC 2021 Shared Task on Anaphora, Bridging, and Discourse Deixis in Dialogue
Sopan Khosla | Juntao Yu | Ramesh Manuvinakurike | Vincent Ng | Massimo Poesio | Michael Strube | Carolyn Rosé
Proceedings of the CODI-CRAC 2021 Shared Task on Anaphora, Bridging, and Discourse Deixis in Dialogue

In this paper, we provide an overview of the CODI-CRAC 2021 Shared-Task: Anaphora Resolution in Dialogue. The shared task focuses on detecting anaphoric relations in different genres of conversations. Using five conversational datasets, four of which have been newly annotated with a wide range of anaphoric relations: identity, bridging references and discourse deixis, we defined multiple subtasks focusing individually on these key relations. We discuss the evaluation scripts used to assess the system performance on these subtasks, and provide a brief summary of the participating systems and the results obtained across ?? runs from 5 teams, with most submissions achieving significantly better results than our baseline methods.

Context or No Context? A preliminary exploration of human-in-the-loop approach for Incremental Temporal Summarization in meetings
Nicole Beckage | Shachi H Kumar | Saurav Sahay | Ramesh Manuvinakurike
Proceedings of the Third Workshop on New Frontiers in Summarization

Incremental meeting temporal summarization, summarizing relevant information of partial multi-party meeting dialogue, is emerging as the next challenge in summarization research. Here we examine the extent to which human abstractive summaries of the preceding increments (context) can be combined with extractive meeting dialogue to generate abstractive summaries. We find that previous context improves ROUGE scores. Our findings further suggest that contexts begin to outweigh the dialogue. Using keyphrase extraction and semantic role labeling (SRL), we find that SRL captures relevant information without overwhelming the model architecture. By compressing the previous contexts by ~70%, we achieve better ROUGE scores over our baseline models. Collectively, these results suggest that context matters, as does the way in which context is presented to the model.

2020

RDG-Map: A Multimodal Corpus of Pedagogical Human-Agent Spoken Interactions.
Maike Paetzel | Deepthi Karkada | Ramesh Manuvinakurike
Proceedings of the 12th Language Resources and Evaluation Conference

This paper presents a multimodal corpus of 209 spoken game dialogues between a human and a remote-controlled artificial agent. The interactions involve people collaborating with the agent to identify countries on the world map as quickly as possible, which allows studying rapid and spontaneous dialogue with complex anaphoras, disfluent utterances and incorrect descriptions. The corpus consists of two parts: 8 hours of game interactions have been collected with a virtual unembodied agent online and 26.8 hours have been recorded with a physically embodied robot in a research lab. In addition to spoken audio recordings available for both parts, camera recordings and skeleton-, facial expression- and eye-gaze tracking data have been collected for the lab-based part of the corpus. In this paper, we introduce the pedagogical reference resolution game (RDG-Map) and the characteristics of the corpus collected. We also present an annotation scheme we developed in order to study the dialogue strategies utilized by the players. Based on a subset of 330 minutes of interactions annotated so far, we discuss initial insights into these strategies as well as the potential of the corpus for future research.

Nontrivial Lexical Convergence in a Geography-Themed Game
Amanda Bergqvist | Ramesh Manuvinakurike | Deepthi Karkada | Maike Paetzel
Proceedings of the 21th Annual Meeting of the Special Interest Group on Discourse and Dialogue

The present study aims to examine the prevalent notion that people entrain to the vocabulary of a dialogue system. Although previous research shows that people will replace their choice of words with simple substitutes, studies using more challenging substitutions are sparse. In this paper, we investigate whether people adapt their speech to the vocabulary of a dialogue system when the system’s suggested words are not direct synonyms. 32 participants played a geography-themed game with a remote-controlled agent and were primed by referencing strategies (rather than individual terms) introduced in follow-up questions. Our results suggest that context-appropriate substitutes support convergence and that the convergence has a lasting effect within a dialogue session if the system’s wording is more consistent with the norms of the domain than the original wording of the speaker.

2018

Edit me: A Corpus and a Framework for Understanding Natural Language Image Editing
Ramesh Manuvinakurike | Jacqueline Brixey | Trung Bui | Walter Chang | Doo Soon Kim | Ron Artstein | Kallirroi Georgila
Proceedings of the Eleventh International Conference on Language Resources and Evaluation (LREC 2018)

DialEdit: Annotations for Spoken Conversational Image Editing
Ramesh Manuvirakurike | Jacqueline Brixey | Trung Bui | Walter Chang | Ron Artstein | Kallirroi Georgila
Proceedings 14th Joint ACL - ISO Workshop on Interoperable Semantic Annotation

A Dialogue Annotation Scheme for Weight Management Chat using the Trans-Theoretical Model of Health Behavior Change
Ramesh Manuvirakurike | Sumanth Bharawadj | Kallirroi Georgila
Proceedings 14th Joint ACL - ISO Workshop on Interoperable Semantic Annotation

Towards Understanding End-of-trip Instructions in a Taxi Ride Scenario
Deepthi Karkada | Ramesh Manuvirakurike | Kallirroi Georgila
Proceedings 14th Joint ACL - ISO Workshop on Interoperable Semantic Annotation

Conversational Image Editing: Incremental Intent Identification in a New Dialogue Task
Ramesh Manuvinakurike | Trung Bui | Walter Chang | Kallirroi Georgila
Proceedings of the 19th Annual SIGdial Meeting on Discourse and Dialogue

We present “conversational image editing”, a novel real-world application domain combining dialogue, visual information, and the use of computer vision. We discuss the importance of dialogue incrementality in this task, and build various models for incremental intent identification based on deep learning and traditional classification algorithms. We show how our model based on convolutional neural networks outperforms models based on random forests, long short-term memory networks, and conditional random fields. By training embeddings based on image-related dialogue corpora, we outperform pre-trained out-of-the-box embeddings for intention identification tasks. Our experiments also provide evidence that incremental intent processing may be more efficient for the user and could save time in accomplishing tasks.

2017

Using Reinforcement Learning to Model Incrementality in a Fast-Paced Dialogue Game
Ramesh Manuvinakurike | David DeVault | Kallirroi Georgila
Proceedings of the 18th Annual SIGdial Meeting on Discourse and Dialogue

We apply Reinforcement Learning (RL) to the problem of incremental dialogue policy learning in the context of a fast-paced dialogue game. We compare the policy learned by RL with a high-performance baseline policy which has been shown to perform very efficiently (nearly as well as humans) in this dialogue game. The RL policy outperforms the baseline policy in offline simulations (based on real user data). We provide a detailed comparison of the RL policy and the baseline policy, including information about how much effort and time it took to develop each one of them. We also highlight the cases where the RL policy performs better, and show that understanding the RL policy can provide valuable insights which can inform the creation of an even better rule-based policy.

2016

PentoRef: A Corpus of Spoken References in Task-oriented Dialogues
Sina Zarrieß | Julian Hough | Casey Kennington | Ramesh Manuvinakurike | David DeVault | Raquel Fernández | David Schlangen
Proceedings of the Tenth International Conference on Language Resources and Evaluation (LREC'16)

PentoRef is a corpus of task-oriented dialogues collected in systematically manipulated settings. The corpus is multilingual, with English and German sections, and overall comprises more than 20000 utterances. The dialogues are fully transcribed and annotated with referring expressions mapped to objects in corresponding visual scenes, which makes the corpus a rich resource for research on spoken referring expressions in generation and resolution. The corpus includes several sub-corpora that correspond to different dialogue situations where parameters related to interactivity, visual access, and verbal channel have been manipulated in systematic ways. The corpus thus lends itself to very targeted studies of reference in spontaneous dialogue.

Real-Time Understanding of Complex Discriminative Scene Descriptions
Ramesh Manuvinakurike | Casey Kennington | David DeVault | David Schlangen
Proceedings of the 17th Annual Meeting of the Special Interest Group on Discourse and Dialogue

Toward incremental dialogue act segmentation in fast-paced interactive dialogue systems
Ramesh Manuvinakurike | Maike Paetzel | Cheng Qu | David Schlangen | David DeVault
Proceedings of the 17th Annual Meeting of the Special Interest Group on Discourse and Dialogue

2015

“So, which one is it?” The effect of alternative incremental architectures in a high-performance game-playing agent
Maike Paetzel | Ramesh Manuvinakurike | David DeVault
Proceedings of the 16th Annual Meeting of the Special Interest Group on Discourse and Dialogue