Sam Thomson


2021

pdf bib
Compositional Generalization for Neural Semantic Parsing via Span-level Supervised Attention
Pengcheng Yin | Hao Fang | Graham Neubig | Adam Pauls | Emmanouil Antonios Platanios | Yu Su | Sam Thomson | Jacob Andreas
Proceedings of the 2021 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies

We describe a span-level supervised attention loss that improves compositional generalization in semantic parsers. Our approach builds on existing losses that encourage attention maps in neural sequence-to-sequence models to imitate the output of classical word alignment algorithms. Where past work has used word-level alignments, we focus on spans; borrowing ideas from phrase-based machine translation, we align subtrees in semantic parses to spans of input sentences, and encourage neural attention mechanisms to mimic these alignments. This method improves the performance of transformers, RNNs, and structured decoders on three benchmarks of compositional generalization.

pdf bib
Value-Agnostic Conversational Semantic Parsing
Emmanouil Antonios Platanios | Adam Pauls | Subhro Roy | Yuchen Zhang | Alexander Kyte | Alan Guo | Sam Thomson | Jayant Krishnamurthy | Jason Wolfe | Jacob Andreas | Dan Klein
Proceedings of the 59th Annual Meeting of the Association for Computational Linguistics and the 11th International Joint Conference on Natural Language Processing (Volume 1: Long Papers)

Conversational semantic parsers map user utterances to executable programs given dialogue histories composed of previous utterances, programs, and system responses. Existing parsers typically condition on rich representations of history that include the complete set of values and computations previously discussed. We propose a model that abstracts over values to focus prediction on type- and function-level context. This approach provides a compact encoding of dialogue histories and predicted programs, improving generalization and computational efficiency. Our model incorporates several other components, including an atomic span copy operation and structural enforcement of well-formedness constraints on predicted programs, that are particularly advantageous in the low-data regime. Trained on the SMCalFlow and TreeDST datasets, our model outperforms prior work by 7.3% and 10.6% respectively in terms of absolute accuracy. Trained on only a thousand examples from each dataset, it outperforms strong baselines by 12.4% and 6.4%. These results indicate that simple representations are key to effective generalization in conversational semantic parsing.

2020

pdf bib
Task-Oriented Dialogue as Dataflow Synthesis
Jacob Andreas | John Bufe | David Burkett | Charles Chen | Josh Clausman | Jean Crawford | Kate Crim | Jordan DeLoach | Leah Dorner | Jason Eisner | Hao Fang | Alan Guo | David Hall | Kristin Hayes | Kellie Hill | Diana Ho | Wendy Iwaszuk | Smriti Jha | Dan Klein | Jayant Krishnamurthy | Theo Lanman | Percy Liang | Christopher H. Lin | Ilya Lintsbakh | Andy McGovern | Aleksandr Nisnevich | Adam Pauls | Dmitrij Petters | Brent Read | Dan Roth | Subhro Roy | Jesse Rusak | Beth Short | Div Slomin | Ben Snyder | Stephon Striplin | Yu Su | Zachary Tellman | Sam Thomson | Andrei Vorobev | Izabela Witoszko | Jason Wolfe | Abby Wray | Yuchen Zhang | Alexander Zotov
Transactions of the Association for Computational Linguistics, Volume 8

We describe an approach to task-oriented dialogue in which dialogue state is represented as a dataflow graph. A dialogue agent maps each user utterance to a program that extends this graph. Programs include metacomputation operators for reference and revision that reuse dataflow fragments from previous turns. Our graph-based state enables the expression and manipulation of complex user intents, and explicit metacomputation makes these intents easier for learned models to predict. We introduce a new dataset, SMCalFlow, featuring complex dialogues about events, weather, places, and people. Experiments show that dataflow graphs and metacomputation substantially improve representability and predictability in these natural dialogues. Additional experiments on the MultiWOZ dataset show that our dataflow representation enables an otherwise off-the-shelf sequence-to-sequence model to match the best existing task-specific state tracking model. The SMCalFlow dataset, code for replicating experiments, and a public leaderboard are available at https://www.microsoft.com/en-us/research/project/dataflow-based-dialogue-semantic-machines.

2018

pdf bib
Learning Joint Semantic Parsers from Disjoint Data
Hao Peng | Sam Thomson | Swabha Swayamdipta | Noah A. Smith
Proceedings of the 2018 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies, Volume 1 (Long Papers)

We present a new approach to learning a semantic parser from multiple datasets, even when the target semantic formalisms are drastically different and the underlying corpora do not overlap. We handle such “disjoint” data by treating annotations for unobserved formalisms as latent structured variables. Building on state-of-the-art baselines, we show improvements both in frame-semantic parsing and semantic dependency parsing by modeling them jointly.

pdf bib
Bridging CNNs, RNNs, and Weighted Finite-State Machines
Roy Schwartz | Sam Thomson | Noah A. Smith
Proceedings of the 56th Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers)

Recurrent and convolutional neural networks comprise two distinct families of models that have proven to be useful for encoding natural language utterances. In this paper we present SoPa, a new model that aims to bridge these two approaches. SoPa combines neural representation learning with weighted finite-state automata (WFSAs) to learn a soft version of traditional surface patterns. We show that SoPa is an extension of a one-layer CNN, and that such CNNs are equivalent to a restricted version of SoPa, and accordingly, to a restricted form of WFSA. Empirically, on three text classification tasks, SoPa is comparable or better than both a BiLSTM (RNN) baseline and a CNN baseline, and is particularly useful in small data settings.

pdf bib
Backpropagating through Structured Argmax using a SPIGOT
Hao Peng | Sam Thomson | Noah A. Smith
Proceedings of the 56th Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers)

We introduce structured projection of intermediate gradients (SPIGOT), a new method for backpropagating through neural networks that include hard-decision structured predictions (e.g., parsing) in intermediate layers. SPIGOT requires no marginal inference, unlike structured attention networks and reinforcement learning-inspired solutions. Like so-called straight-through estimators, SPIGOT defines gradient-like quantities associated with intermediate nondifferentiable operations, allowing backpropagation before and after them; SPIGOT’s proxy aims to ensure that, after a parameter update, the intermediate structure will remain well-formed. We experiment on two structured NLP pipelines: syntactic-then-semantic dependency parsing, and semantic parsing followed by sentiment classification. We show that training with SPIGOT leads to a larger improvement on the downstream task than a modularly-trained pipeline, the straight-through estimator, and structured attention, reaching a new state of the art on semantic dependency parsing.

pdf bib
Rational Recurrences
Hao Peng | Roy Schwartz | Sam Thomson | Noah A. Smith
Proceedings of the 2018 Conference on Empirical Methods in Natural Language Processing

Despite the tremendous empirical success of neural models in natural language processing, many of them lack the strong intuitions that accompany classical machine learning approaches. Recently, connections have been shown between convolutional neural networks (CNNs) and weighted finite state automata (WFSAs), leading to new interpretations and insights. In this work, we show that some recurrent neural networks also share this connection to WFSAs. We characterize this connection formally, defining rational recurrences to be recurrent hidden state update functions that can be written as the Forward calculation of a finite set of WFSAs. We show that several recent neural models use rational recurrences. Our analysis provides a fresh view of these models and facilitates devising new neural architectures that draw inspiration from WFSAs. We present one such model, which performs better than two recent baselines on language modeling and text classification. Our results demonstrate that transferring intuitions from classical models like WFSAs can be an effective approach to designing and understanding neural models.

pdf bib
Syntactic Scaffolds for Semantic Structures
Swabha Swayamdipta | Sam Thomson | Kenton Lee | Luke Zettlemoyer | Chris Dyer | Noah A. Smith
Proceedings of the 2018 Conference on Empirical Methods in Natural Language Processing

We introduce the syntactic scaffold, an approach to incorporating syntactic information into semantic tasks. Syntactic scaffolds avoid expensive syntactic processing at runtime, only making use of a treebank during training, through a multitask objective. We improve over strong baselines on PropBank semantics, frame semantics, and coreference resolution, achieving competitive performance on all three tasks.

2017

pdf bib
Deep Multitask Learning for Semantic Dependency Parsing
Hao Peng | Sam Thomson | Noah A. Smith
Proceedings of the 55th Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers)

We present a deep neural architecture that parses sentences into three semantic dependency graph formalisms. By using efficient, nearly arc-factored inference and a bidirectional-LSTM composed with a multi-layer perceptron, our base system is able to significantly improve the state of the art for semantic dependency parsing, without using hand-engineered features or syntax. We then explore two multitask learning approaches—one that shares parameters across formalisms, and one that uses higher-order structures to predict the graphs jointly. We find that both approaches improve performance across formalisms on average, achieving a new state of the art. Our code is open-source and available at https://github.com/Noahs-ARK/NeurboParser.

2015

pdf bib
Frame-Semantic Role Labeling with Heterogeneous Annotations
Meghana Kshirsagar | Sam Thomson | Nathan Schneider | Jaime Carbonell | Noah A. Smith | Chris Dyer
Proceedings of the 53rd Annual Meeting of the Association for Computational Linguistics and the 7th International Joint Conference on Natural Language Processing (Volume 2: Short Papers)

pdf bib
Toward Abstractive Summarization Using Semantic Representations
Fei Liu | Jeffrey Flanigan | Sam Thomson | Norman Sadeh | Noah A. Smith
Proceedings of the 2015 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies

2014

pdf bib
CMU: Arc-Factored, Discriminative Semantic Dependency Parsing
Sam Thomson | Brendan O’Connor | Jeffrey Flanigan | David Bamman | Jesse Dodge | Swabha Swayamdipta | Nathan Schneider | Chris Dyer | Noah A. Smith
Proceedings of the 8th International Workshop on Semantic Evaluation (SemEval 2014)

pdf bib
A Discriminative Graph-Based Parser for the Abstract Meaning Representation
Jeffrey Flanigan | Sam Thomson | Jaime Carbonell | Chris Dyer | Noah A. Smith
Proceedings of the 52nd Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers)