Semantic parsing from denotations faces two key challenges in model training: (1) searching for good candidate semantic parses given only the denotations (e.g., answers), and (2) choosing the best model update algorithm. We propose effective and general solutions to both. Using policy shaping, we bias the search procedure toward semantic parses that are more compatible with the text, which provide better supervision signals for training. In addition, we propose an update equation that generalizes three different families of learning algorithms, enabling fast model exploration. When evaluated on a recently proposed sequential question answering dataset, our framework yields a new state-of-the-art model that outperforms previous work by 5.0% absolute in exact-match accuracy.
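To make the two ideas concrete, below is a minimal PyTorch sketch. The function names (`shaped_search_distribution`, `generalized_update_loss`) and the exact weighting schemes are our own illustration rather than the paper's notation; the sketch assumes a fixed beam of K candidate parses per question, with a 0/1 correctness signal obtained by executing each parse against the denotation.

```python
import torch

def shaped_search_distribution(parser_logits, critique_logits):
    """Policy shaping (sketch): combine the parser's distribution with a
    text-compatibility critique multiplicatively, q(y) ~ p(y|x) * c(y|x).
    Adding logits before the softmax implements the (renormalized)
    product of the two distributions."""
    return torch.softmax(parser_logits + critique_logits, dim=0)

def generalized_update_loss(logits, correct, mode="mml"):
    """Loss whose gradient is -sum_y w(y) * grad log p(y|x); choosing the
    weight function w(y) recovers each algorithm family. `correct` is a
    0/1 float vector marking parses whose execution yields the right
    denotation. The three weightings below are illustrative."""
    probs = torch.softmax(logits, dim=0)
    eps = 1e-12
    if mode == "mml":
        # Maximum marginal likelihood: posterior renormalized over
        # the correct parses in the beam.
        w = correct * probs / (correct * probs).sum().clamp_min(eps)
    elif mode == "policy_gradient":
        # REINFORCE-style: each parse weighted by probability * reward.
        w = probs * correct
    else:  # "margin"
        # Margin-style: unit weight on the highest-scoring correct parse.
        w = torch.zeros_like(probs)
        w[(logits + correct.clamp_min(eps).log()).argmax()] = 1.0
    # Detaching w treats the weights as constants, so the gradient is
    # exactly the weighted sum of score-function terms.
    return -(w.detach() * probs.clamp_min(eps).log()).sum()
```

Swapping `mode` while holding the rest of the training loop fixed is what enables the fast model exploration the abstract refers to: the three families differ only in how they distribute weight over the candidate parses.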
We propose to decompose instruction execution into goal prediction and action generation. We design a model that maps raw visual observations to goals using LINGUNET, a language-conditioned image generation network, and then generates the actions required to complete them. Our model is trained from demonstrations only, without external resources. To evaluate our approach, we introduce two benchmarks for instruction following: LANI, a navigation task, and CHAI, in which an agent executes household instructions. Our evaluation demonstrates the advantages of our model decomposition and illustrates the challenges posed by our new benchmarks.
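The following PyTorch sketch illustrates the LINGUNET idea of conditioning an image-to-image network on language: 1x1 convolution kernels are generated from the instruction embedding and applied to each level of a convolutional encoder, and a deconvolutional decoder produces a spatial goal-probability map. All dimensions and layer counts here are placeholder assumptions, not the published configuration.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class LingUNetSketch(nn.Module):
    """Minimal LingUNet-style goal predictor (illustrative dimensions)."""

    def __init__(self, channels=32, text_dim=64, levels=3):
        super().__init__()
        self.levels = levels
        self.enc = nn.ModuleList(
            [nn.Conv2d(3 if i == 0 else channels, channels,
                       5, stride=2, padding=2) for i in range(levels)])
        # One linear map per level: instruction embedding -> 1x1 kernels.
        self.text_to_kernel = nn.ModuleList(
            [nn.Linear(text_dim, channels * channels) for _ in range(levels)])
        self.dec = nn.ModuleList(
            [nn.ConvTranspose2d(channels if i == levels - 1 else 2 * channels,
                                channels, 5, stride=2, padding=2,
                                output_padding=1)
             for i in reversed(range(levels))])
        self.out = nn.Conv2d(channels, 1, 1)

    def forward(self, image, text_emb):
        # image: (B, 3, H, W) with H, W divisible by 2**levels;
        # text_emb: (B, text_dim) instruction representation.
        feats, filtered, h = [], [], image
        for conv in self.enc:
            h = F.relu(conv(h))
            feats.append(h)
        for i, f in enumerate(feats):
            k = self.text_to_kernel[i](text_emb)  # (B, C*C)
            B, C = f.size(0), f.size(1)
            # Per-example, text-conditioned 1x1 convolution of the features.
            g = torch.stack([
                F.conv2d(f[b:b + 1], k[b].view(C, C, 1, 1))
                for b in range(B)]).squeeze(1)
            filtered.append(g)
        h = None
        for i, deconv in enumerate(self.dec):
            level = self.levels - 1 - i
            inp = (filtered[level] if h is None
                   else torch.cat([h, filtered[level]], dim=1))
            h = F.relu(deconv(inp))
        logits = self.out(h)  # (B, 1, H, W) goal-location scores
        return torch.softmax(logits.flatten(1), dim=1).view_as(logits)
```

The action generator (not shown) would then condition on the predicted goal map rather than on the instruction directly, which is the decomposition the abstract argues for.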
We propose to directly map raw visual observations and text input to actions for instruction execution. While existing approaches assume access to structured environment representations or use a pipeline of separately trained models, we learn a single model that jointly reasons about linguistic and visual input. We use reinforcement learning in a contextual bandit setting to train a neural network agent. To guide the agent's exploration, we use reward shaping with different forms of supervision. Our approach does not require intermediate representations, planning procedures, or training different models. We evaluate our approach in a simulated environment and show significant improvements over supervised learning and common reinforcement learning variants.
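A minimal sketch of the contextual-bandit update is shown below; `policy` and `shaped_reward_fn` are hypothetical stand-ins for the agent network and the shaped reward described in the abstract. The key point it illustrates is that, unlike full reinforcement learning, the agent is updated from the immediate shaped reward of a single sampled action, with no credit assignment across time steps.

```python
import torch
from torch.distributions import Categorical

def contextual_bandit_step(policy, state, shaped_reward_fn, optimizer):
    """One policy-gradient update in the contextual bandit setting."""
    logits = policy(state)                  # (num_actions,) action scores
    dist = Categorical(logits=logits)
    action = dist.sample()
    # Reward shaping: the sparse task reward is augmented with a dense
    # term derived from supervision (e.g., progress toward the goal or
    # agreement with a demonstration).
    reward = shaped_reward_fn(state, action.item())
    loss = -reward * dist.log_prob(action)  # bandit REINFORCE update
    optimizer.zero_grad()
    loss.backward()
    optimizer.step()
    return action.item(), reward
```

Because the shaped reward is informative at every step, this immediate update can stand in for the delayed returns of standard reinforcement learning, which is what makes the bandit formulation viable here.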