Srinivasan Iyer


2021

pdf bib
RECONSIDER: Improved Re-Ranking using Span-Focused Cross-Attention for Open Domain Question Answering
Srinivasan Iyer | Sewon Min | Yashar Mehdad | Wen-tau Yih
Proceedings of the 2021 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies

State-of-the-art Machine Reading Comprehension (MRC) models for Open-domain Question Answering (QA) are typically trained for span selection using distantly supervised positive examples and heuristically retrieved negative examples. This training scheme possibly explains empirical observations that these models achieve a high recall amongst their top few predictions, but a low overall accuracy, motivating the need for answer re-ranking. We develop a successful re-ranking approach (RECONSIDER) for span-extraction tasks that improves upon the performance of MRC models, even beyond large-scale pre-training. RECONSIDER is trained on positive and negative examples extracted from high confidence MRC model predictions, and uses in-passage span annotations to perform span-focused re-ranking over a smaller candidate set. As a result, RECONSIDER learns to eliminate close false positives, achieving a new extractive state of the art on four QA tasks, with 45.5% Exact Match accuracy on Natural Questions with real user questions, and 61.7% on TriviaQA. We will release all related data, models, and code.

pdf bib
Do Explanations Help Users Detect Errors in Open-Domain QA? An Evaluation of Spoken vs. Visual Explanations
Ana Valeria González | Gagan Bansal | Angela Fan | Yashar Mehdad | Robin Jia | Srinivasan Iyer
Findings of the Association for Computational Linguistics: ACL-IJCNLP 2021

pdf bib
EASE: Extractive-Abstractive Summarization End-to-End using the Information Bottleneck Principle
Haoran Li | Arash Einolghozati | Srinivasan Iyer | Bhargavi Paranjape | Yashar Mehdad | Sonal Gupta | Marjan Ghazvininejad
Proceedings of the Third Workshop on New Frontiers in Summarization

Current abstractive summarization systems outperform their extractive counterparts, but their widespread adoption is inhibited by the inherent lack of interpretability. Extractive summarization systems, though interpretable, suffer from redundancy and possible lack of coherence. To achieve the best of both worlds, we propose EASE, an extractive-abstractive framework that generates concise abstractive summaries that can be traced back to an extractive summary. Our framework can be applied to any evidence-based text generation problem and can accommodate various pretrained models in its simple architecture. We use the Information Bottleneck principle to jointly train the extraction and abstraction in an end-to-end fashion. Inspired by previous research that humans use a two-stage framework to summarize long documents (Jing and McKeown, 2000), our framework first extracts a pre-defined amount of evidence spans and then generates a summary using only the evidence. Using automatic and human evaluations, we show that the generated summaries are better than strong extractive and extractive-abstractive baselines.

2020

pdf bib
Efficient One-Pass End-to-End Entity Linking for Questions
Belinda Z. Li | Sewon Min | Srinivasan Iyer | Yashar Mehdad | Wen-tau Yih
Proceedings of the 2020 Conference on Empirical Methods in Natural Language Processing (EMNLP)

We present ELQ, a fast end-to-end entity linking model for questions, which uses a biencoder to jointly perform mention detection and linking in one pass. Evaluated on WebQSP and GraphQuestions with extended annotations that cover multiple entities per question, ELQ outperforms the previous state of the art by a large margin of +12.7% and +19.6% F1, respectively. With a very fast inference time (1.57 examples/s on a single CPU), ELQ can be useful for downstream question answering systems. In a proof-of-concept experiment, we demonstrate that using ELQ significantly improves the downstream QA performance of GraphRetriever.

pdf bib
Proceedings of the First Workshop on Interactive and Executable Semantic Parsing
Ben Bogin | Srinivasan Iyer | Victoria Lin | Dragomir Radev | Alane Suhr | Panupong | Caiming Xiong | Pengcheng Yin | Tao Yu | Rui Zhang | Victor Zhong
Proceedings of the First Workshop on Interactive and Executable Semantic Parsing

2019

pdf bib
Learning Programmatic Idioms for Scalable Semantic Parsing
Srinivasan Iyer | Alvin Cheung | Luke Zettlemoyer
Proceedings of the 2019 Conference on Empirical Methods in Natural Language Processing and the 9th International Joint Conference on Natural Language Processing (EMNLP-IJCNLP)

Programmers typically organize executable source code using high-level coding patterns or idiomatic structures such as nested loops, exception handlers and recursive blocks, rather than as individual code tokens. In contrast, state of the art (SOTA) semantic parsers still map natural language instructions to source code by building the code syntax tree one node at a time. In this paper, we introduce an iterative method to extract code idioms from large source code corpora by repeatedly collapsing most-frequent depth-2 subtrees of their syntax trees, and train semantic parsers to apply these idioms during decoding. Applying idiom-based decoding on a recent context-dependent semantic parsing task improves the SOTA by 2.2% BLEU score while reducing training time by more than 50%. This improved speed enables us to scale up the model by training on an extended training set that is 5× larger, to further move up the SOTA by an additional 2.3% BLEU and 0.9% exact match. Finally, idioms also significantly improve accuracy of semantic parsing to SQL on the ATIS-SQL dataset, when training data is limited.

pdf bib
JuICe: A Large Scale Distantly Supervised Dataset for Open Domain Context-based Code Generation
Rajas Agashe | Srinivasan Iyer | Luke Zettlemoyer
Proceedings of the 2019 Conference on Empirical Methods in Natural Language Processing and the 9th International Joint Conference on Natural Language Processing (EMNLP-IJCNLP)

Interactive programming with interleaved code snippet cells and natural language markdown is recently gaining popularity in the form of Jupyter notebooks, which accelerate prototyping and collaboration. To study code generation conditioned on a long context history, we present JuICe, a corpus of 1.5 million examples with a curated test set of 3.7K instances based on online programming assignments. Compared with existing contextual code generation datasets, JuICe provides refined human-curated data, open-domain code, and an order of magnitude more training data. Using JuICe, we train models for two tasks: (1) generation of the API call sequence in a code cell, and (2) full code cell generation, both conditioned on the NL-Code history up to a particular code cell. Experiments using current baseline code generation models show that both context and distant supervision aid in generation, and that the dataset is challenging for current systems.

pdf bib
Proceedings of the Workshop on Methods for Optimizing and Evaluating Neural Language Generation
Antoine Bosselut | Asli Celikyilmaz | Marjan Ghazvininejad | Srinivasan Iyer | Urvashi Khandelwal | Hannah Rashkin | Thomas Wolf
Proceedings of the Workshop on Methods for Optimizing and Evaluating Neural Language Generation

2018

pdf bib
Mapping Language to Code in Programmatic Context
Srinivasan Iyer | Ioannis Konstas | Alvin Cheung | Luke Zettlemoyer
Proceedings of the 2018 Conference on Empirical Methods in Natural Language Processing

Source code is rarely written in isolation. It depends significantly on the programmatic context, such as the class that the code would reside in. To study this phenomenon, we introduce the task of generating class member functions given English documentation and the programmatic context provided by the rest of the class. This task is challenging because the desired code can vary greatly depending on the functionality the class provides (e.g., a sort function may or may not be available when we are asked to “return the smallest element” in a particular member variable list). We introduce CONCODE, a new large dataset with over 100,000 examples consisting of Java classes from online code repositories, and develop a new encoder-decoder architecture that models the interaction between the method documentation and the class environment. We also present a detailed error analysis suggesting that there is significant room for future work on this task.

pdf bib
Neural Semantic Parsing
Matt Gardner | Pradeep Dasigi | Srinivasan Iyer | Alane Suhr | Luke Zettlemoyer
Proceedings of the 56th Annual Meeting of the Association for Computational Linguistics: Tutorial Abstracts

Semantic parsing, the study of translating natural language utterances into machine-executable programs, is a well-established research area and has applications in question answering, instruction following, voice assistants, and code generation. In the last two years, the models used for semantic parsing have changed dramatically with the introduction of neural encoder-decoder methods that allow us to rethink many of the previous assumptions underlying semantic parsing. We aim to inform those already interested in semantic parsing research of these new developments in the field, as well as introduce the topic as an exciting research area to those who are unfamiliar with it. Current approaches for neural semantic parsing share several similarities with neural machine translation, but the key difference between the two fields is that semantic parsing translates natural language into a formal language, while machine translation translates it into a different natural language. The formal language used in semantic parsing allows for constrained decoding, where the model is constrained to only produce outputs that are valid formal statements. We will describe the various approaches researchers have taken to do this. We will also discuss the choice of formal languages used by semantic parsers, and describe why much recent work has chosen to use standard programming languages instead of more linguistically-motivated representations. We will then describe a particularly challenging setting for semantic parsing, where there is additional context or interaction that the parser must take into account when translating natural language to formal language, and give an overview of recent work in this direction. Finally, we will introduce some tools available in AllenNLP for doing semantic parsing research.

pdf bib
Learning to Map Context-Dependent Sentences to Executable Formal Queries
Alane Suhr | Srinivasan Iyer | Yoav Artzi
Proceedings of the 2018 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies, Volume 1 (Long Papers)

We propose a context-dependent model to map utterances within an interaction to executable formal queries. To incorporate interaction history, the model maintains an interaction-level encoder that updates after each turn, and can copy sub-sequences of previously predicted queries during generation. Our approach combines implicit and explicit modeling of references between utterances. We evaluate our model on the ATIS flight planning interactions, and demonstrate the benefits of modeling context and explicit references.

2017

pdf bib
Neural AMR: Sequence-to-Sequence Models for Parsing and Generation
Ioannis Konstas | Srinivasan Iyer | Mark Yatskar | Yejin Choi | Luke Zettlemoyer
Proceedings of the 55th Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers)

Sequence-to-sequence models have shown strong performance across a broad range of applications. However, their application to parsing and generating text using Abstract Meaning Representation (AMR) has been limited, due to the relatively limited amount of labeled data and the non-sequential nature of the AMR graphs. We present a novel training procedure that can lift this limitation using millions of unlabeled sentences and careful preprocessing of the AMR graphs. For AMR parsing, our model achieves competitive results of 62.1 SMATCH, the current best score reported without significant use of external semantic resources. For AMR generation, our model establishes a new state-of-the-art performance of BLEU 33.8. We present extensive ablative and qualitative analysis including strong evidence that sequence-based AMR models are robust against ordering variations of graph-to-sequence conversions.

pdf bib
Learning a Neural Semantic Parser from User Feedback
Srinivasan Iyer | Ioannis Konstas | Alvin Cheung | Jayant Krishnamurthy | Luke Zettlemoyer
Proceedings of the 55th Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers)

We present an approach to rapidly and easily build natural language interfaces to databases for new domains, whose performance improves over time based on user feedback, and requires minimal intervention. To achieve this, we adapt neural sequence models to map utterances directly to SQL with its full expressivity, bypassing any intermediate meaning representations. These models are immediately deployed online to solicit feedback from real users to flag incorrect queries. Finally, the popularity of SQL facilitates gathering annotations for incorrect predictions using the crowd, which is directly used to improve our models. This complete feedback loop, without intermediate representations or database specific engineering, opens up new ways of building high quality semantic parsers. Experiments suggest that this approach can be deployed quickly for any new target domain, as we show by learning a semantic parser for an online academic database from scratch.

2016

pdf bib
Summarizing Source Code using a Neural Attention Model
Srinivasan Iyer | Ioannis Konstas | Alvin Cheung | Luke Zettlemoyer
Proceedings of the 54th Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers)