Stuart M. Shieber

Also published as: Stuart Shieber


2024

string2string: A Modern Python Library for String-to-String Algorithms
Mirac Suzgun | Stuart Shieber | Dan Jurafsky
Proceedings of the 62nd Annual Meeting of the Association for Computational Linguistics (Volume 3: System Demonstrations)

We introduce **string2string**, an open-source library that offers a comprehensive suite of efficient algorithms for a broad range of string-to-string problems. It includes traditional algorithmic solutions as well as recent advanced neural approaches to tackle various problems in string alignment, distance measurement, lexical and semantic search, and similarity analysis, along with several helpful visualization tools and metrics to facilitate the interpretation and analysis of these methods. Notable algorithms featured in the library include the Smith-Waterman algorithm for pairwise local alignment, the Hirschberg algorithm for global alignment, the Wagner-Fischer algorithm for edit distance, BARTScore and BERTScore for similarity analysis, the Knuth-Morris-Pratt algorithm for lexical search, and Faiss for semantic search. In addition, it wraps existing efficient and widely-used implementations of certain frameworks and metrics, such as sacreBLEU and ROUGE. Overall, the library aims to provide extensive coverage and increased flexibility in comparison to existing libraries for strings. It can be used for many downstream applications, tasks, and problems in natural-language processing, bioinformatics, and computational social sciences. It is implemented in Python, easily installable via pip, and accessible through a simple API. Source code, documentation, and tutorials are all available on our GitHub page.

* GitHub page: https://github.com/stanfordnlp/string2string
* Documentation: https://string2string.readthedocs.io/en/latest/
* Short video: https://drive.google.com/file/d/1IT-pBACDVUoEHewk__5Pz5mU5oAMq5k_/view?usp=sharing
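
A minimal usage sketch follows (the library installs via `pip install string2string`). The class and method names are taken from the library’s README at the time of writing and should be checked against the documentation linked above.

```python
# Illustrative only; verify names against the string2string docs.
from string2string.alignment import NeedlemanWunsch
from string2string.distance import LevenshteinEditDistance

# Global alignment of two strings.
nw = NeedlemanWunsch()
aligned_a, aligned_b = nw.get_alignment("kitten", "sitting")
print(aligned_a)
print(aligned_b)

# Wagner-Fischer edit distance.
print(LevenshteinEditDistance().compute("kitten", "sitting"))  # 3
```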

2021

Causal Analysis of Syntactic Agreement Mechanisms in Neural Language Models
Matthew Finlayson | Aaron Mueller | Sebastian Gehrmann | Stuart Shieber | Tal Linzen | Yonatan Belinkov
Proceedings of the 59th Annual Meeting of the Association for Computational Linguistics and the 11th International Joint Conference on Natural Language Processing (Volume 1: Long Papers)

Targeted syntactic evaluations have demonstrated the ability of language models to perform subject-verb agreement given difficult contexts. To elucidate the mechanisms by which the models accomplish this behavior, this study applies causal mediation analysis to pre-trained neural language models. We investigate the magnitude of models’ preferences for grammatical inflections, as well as whether neurons process subject-verb agreement similarly across sentences with different syntactic structures. We uncover similarities and differences across architectures and model sizes—notably, that larger models do not necessarily learn stronger preferences. We also observe two distinct mechanisms for producing subject-verb agreement depending on the syntactic structure of the input sentence. Finally, we find that language models rely on similar sets of neurons when given sentences with similar syntactic structure.
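
As a rough illustration of the intervention at the heart of causal mediation analysis, the sketch below patches a single neuron’s activation from a counterfactual sentence into a run on the base sentence and compares the model’s preference between verb inflections. The layer and neuron indices are hypothetical, and the paper’s actual protocol is considerably more involved.

```python
# Toy neuron-patching sketch with GPT-2 (indices are illustrative).
import torch
from transformers import GPT2LMHeadModel, GPT2Tokenizer

tok = GPT2Tokenizer.from_pretrained("gpt2")
model = GPT2LMHeadModel.from_pretrained("gpt2").eval()

base = tok("The keys to the cabinet", return_tensors="pt")
counterfactual = tok("The key to the cabinet", return_tensors="pt")
layer, neuron = 5, 300  # hypothetical target neuron

# Record the target neuron's activation under the counterfactual input.
acts = {}
def record(_module, _inputs, output):
    acts["v"] = output[0, -1, neuron].item()
handle = model.transformer.h[layer].mlp.register_forward_hook(record)
with torch.no_grad():
    model(**counterfactual)
handle.remove()

def verb_preference(inputs, patch=None):
    """p(' are') / p(' is') at the next-token position, optionally
    with the target neuron overwritten by a patched value."""
    hook_handle = None
    if patch is not None:
        def patch_hook(_module, _inputs, output):
            output = output.clone()
            output[0, -1, neuron] = patch  # intervene on one neuron
            return output
        hook_handle = model.transformer.h[layer].mlp.register_forward_hook(patch_hook)
    with torch.no_grad():
        logits = model(**inputs).logits[0, -1]
    if hook_handle is not None:
        hook_handle.remove()
    probs = logits.softmax(-1)
    return (probs[tok.encode(" are")[0]] / probs[tok.encode(" is")[0]]).item()

print("base preference:   ", verb_preference(base))
print("patched preference:", verb_preference(base, patch=acts["v"]))
```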

2020

Linguistic Features for Readability Assessment
Tovly Deutsch | Masoud Jasbi | Stuart Shieber
Proceedings of the Fifteenth Workshop on Innovative Use of NLP for Building Educational Applications

Readability assessment aims to automatically classify text by the level appropriate for learning readers. Traditional approaches to this task utilize a variety of linguistically motivated features paired with simple machine learning models. More recent methods have improved performance by discarding these features and utilizing deep learning models. However, it is unknown whether augmenting deep learning models with linguistically motivated features would improve performance further. This paper combines these two approaches with the goal of improving overall model performance and addressing this question. Evaluating on two large readability corpora, we find that, given sufficient training data, augmenting deep learning models with linguistically motivated features does not improve state-of-the-art performance. Our results provide preliminary evidence for the hypothesis that the state-of-the-art deep learning models represent linguistic features of the text related to readability. Future research on the nature of representations formed in these models can shed light on the learned features and their relations to linguistically motivated ones hypothesized in traditional approaches.
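
The augmentation being evaluated can be pictured as follows: concatenate handcrafted linguistic features to a learned text encoding before the classification head. The sketch below is a generic version of that architecture; dimensions and feature names are illustrative, not the paper’s.

```python
# Generic feature-augmentation sketch (PyTorch); illustrative only.
import torch
import torch.nn as nn

class AugmentedReadabilityClassifier(nn.Module):
    def __init__(self, enc_dim=768, n_ling_feats=12, n_levels=5):
        super().__init__()
        self.head = nn.Linear(enc_dim + n_ling_feats, n_levels)

    def forward(self, text_encoding, ling_feats):
        # text_encoding: (batch, enc_dim), e.g. a sentence embedding;
        # ling_feats: (batch, n_ling_feats), e.g. sentence length,
        # parse depth, type-token ratio.
        return self.head(torch.cat([text_encoding, ling_feats], dim=-1))

model = AugmentedReadabilityClassifier()
print(model(torch.randn(4, 768), torch.randn(4, 12)).shape)  # (4, 5)
```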

Probing Neural Dialog Models for Conversational Understanding
Abdelrhman Saleh | Tovly Deutsch | Stephen Casper | Yonatan Belinkov | Stuart Shieber
Proceedings of the 2nd Workshop on Natural Language Processing for Conversational AI

The predominant approach to open-domain dialog generation relies on end-to-end training of neural models on chat datasets. However, this approach provides little insight as to what these models learn (or do not learn) about engaging in dialog. In this study, we analyze the internal representations learned by neural open-domain dialog systems and evaluate the quality of these representations for learning basic conversational skills. Our results suggest that standard open-domain dialog systems struggle with answering questions, inferring contradiction, and determining the topic of conversation, among other tasks. We also find that the dyadic, turn-taking nature of dialog is not fully leveraged by these models. By exploring these limitations, we highlight the need for additional research into architectures and training methods that can better capture high-level information about dialog.
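
Probing of this kind trains a simple classifier on frozen model representations and reads its accuracy as evidence of what the representations encode. The sketch below uses random stand-ins for the dialog model’s hidden states; it shows the shape of the method, not the paper’s pipeline.

```python
# Generic probing-classifier sketch; representations are stand-ins.
import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split

rng = np.random.default_rng(0)
reps = rng.normal(size=(1000, 512))     # frozen encoder outputs (toy)
labels = rng.integers(0, 2, size=1000)  # e.g., "is this turn a question?"

X_tr, X_te, y_tr, y_te = train_test_split(reps, labels, random_state=0)
probe = LogisticRegression(max_iter=1000).fit(X_tr, y_tr)
print("probe accuracy:", probe.score(X_te, y_te))  # ~0.5 on random reps
```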

2019

On Adversarial Removal of Hypothesis-only Bias in Natural Language Inference
Yonatan Belinkov | Adam Poliak | Stuart Shieber | Benjamin Van Durme | Alexander Rush
Proceedings of the Eighth Joint Conference on Lexical and Computational Semantics (*SEM 2019)

Popular Natural Language Inference (NLI) datasets have been shown to be tainted by hypothesis-only biases. Adversarial learning may help models ignore sensitive biases and spurious correlations in data. We evaluate whether adversarial learning can be used in NLI to encourage models to learn representations free of hypothesis-only biases. Our analyses indicate that the representations learned via adversarial learning may be less biased, with only small drops in NLI accuracy.
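
Adversarial removal of this kind is commonly implemented with a gradient-reversal layer: the encoder feeds a hypothesis-only adversary whose gradients are negated, pushing the encoder to discard the bias. The sketch below is a standard minimal version of that device, not necessarily the paper’s exact configuration.

```python
# Minimal gradient-reversal sketch (PyTorch); illustrative only.
import torch

class GradReverse(torch.autograd.Function):
    """Identity on the forward pass; negated, scaled gradient backward."""
    @staticmethod
    def forward(ctx, x, lambd):
        ctx.lambd = lambd
        return x.view_as(x)

    @staticmethod
    def backward(ctx, grad_output):
        return -ctx.lambd * grad_output, None

hidden = torch.randn(8, 256, requires_grad=True)  # encoder output (toy)
adversary = torch.nn.Linear(256, 3)               # hypothesis-only label head
adv_logits = adversary(GradReverse.apply(hidden, 1.0))
loss = torch.nn.functional.cross_entropy(adv_logits, torch.randint(0, 3, (8,)))
loss.backward()  # hidden.grad now points against the adversary's objective
```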

Don’t Take the Premise for Granted: Mitigating Artifacts in Natural Language Inference
Yonatan Belinkov | Adam Poliak | Stuart Shieber | Benjamin Van Durme | Alexander Rush
Proceedings of the 57th Annual Meeting of the Association for Computational Linguistics

Natural Language Inference (NLI) datasets often contain hypothesis-only biases—artifacts that allow models to achieve non-trivial performance without learning whether a premise entails a hypothesis. We propose two probabilistic methods to build models that are more robust to such biases and better transfer across datasets. In contrast to standard approaches to NLI, our methods predict the probability of a premise given a hypothesis and NLI label, discouraging models from ignoring the premise. We evaluate our methods on synthetic and existing NLI datasets by training on datasets containing biases and testing on datasets containing no (or different) hypothesis-only biases. Our results indicate that these methods can make NLI models more robust to dataset-specific artifacts, transferring better than a baseline architecture in 9 out of 12 NLI datasets. Additionally, we provide an extensive analysis of the interplay of our methods with known biases in NLI datasets, as well as the effects of encouraging models to ignore biases and fine-tuning on target datasets.
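
The intuition behind predicting the premise can be seen in a one-line Bayes decomposition (a motivating identity, not the paper’s exact training objectives):

```latex
\[
\log p(P \mid H, y) \;=\; \log p(y \mid P, H) \;-\; \log p(y \mid H) \;+\; \log p(P \mid H)
\]
```

Maximizing the left-hand side rewards exactly the label information that the premise contributes beyond a hypothesis-only predictor p(y | H), so a model that ignores the premise gains nothing.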

On Evaluating the Generalization of LSTM Models in Formal Languages
Mirac Suzgun | Yonatan Belinkov | Stuart M. Shieber
Proceedings of the Society for Computation in Linguistics (SCiL) 2019

LSTM Networks Can Perform Dynamic Counting
Mirac Suzgun | Yonatan Belinkov | Stuart Shieber | Sebastian Gehrmann
Proceedings of the Workshop on Deep Learning and Formal Languages: Building Bridges

In this paper, we systematically assess the ability of standard recurrent networks to perform dynamic counting and to encode hierarchical representations. All the neural models in our experiments are designed to be small networks, both to prevent them from memorizing the training sets and to allow us to visualize and interpret their behaviour at test time. Our results demonstrate that Long Short-Term Memory (LSTM) networks can learn to recognize the well-balanced parenthesis language (Dyck-1) and the shuffles of multiple Dyck-1 languages, each defined over a different parenthesis pair, by emulating simple real-time k-counter machines. To the best of our knowledge, this work is the first study to introduce shuffle languages to analyze the computational power of neural networks. We also show that a single-layer LSTM with only one hidden unit is practically sufficient for recognizing the Dyck-1 language. However, none of our recurrent networks yielded good performance on the Dyck-2 language learning task, which requires a model to have a stack-like mechanism for recognition.
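
The k-counter machine being emulated is easy to state directly. Here is an illustrative recognizer for the shuffle of several Dyck-1 languages, with one counter per parenthesis pair (a re-implementation of the target language’s definition, not the paper’s code):

```python
# Accept iff no counter dips below zero and all counters end at zero.
def shuffle_dyck(string, pairs=("()", "[]")):
    counts = [0] * len(pairs)
    for ch in string:
        for i, (open_, close) in enumerate(pairs):
            if ch == open_:
                counts[i] += 1
            elif ch == close:
                counts[i] -= 1
                if counts[i] < 0:  # closing with nothing open
                    return False
    return all(c == 0 for c in counts)

print(shuffle_dyck("([)]"))  # True: a shuffle of "()" and "[]"
print(shuffle_dyck("(()["))  # False: unmatched symbols remain
```

Dyck-2, by contrast, requires remembering which bracket type was opened most recently, which counters alone cannot track; hence the stack-like mechanism the abstract mentions.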

2018

Learning Neural Templates for Text Generation
Sam Wiseman | Stuart Shieber | Alexander Rush
Proceedings of the 2018 Conference on Empirical Methods in Natural Language Processing

While neural encoder-decoder models have had significant empirical success in text generation, there remain several unaddressed problems with this style of generation. Encoder-decoder models are largely (a) uninterpretable, and (b) difficult to control in terms of their phrasing or content. This work proposes a neural generation system using a hidden semi-Markov model (HSMM) decoder, which learns latent, discrete templates jointly with learning to generate. We show that this model learns useful templates, and that these templates make generation both more interpretable and controllable. Furthermore, we show that this approach scales to real data sets and achieves strong performance nearing that of encoder-decoder text generation models.
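
The template idea can be made concrete: each latent state emits either fixed words or a field of the source record. The toy example below is hand-specified for illustration; the paper learns such segmentations with the HSMM rather than writing them by hand.

```python
# Hand-written stand-in for a learned latent template.
record = {"name": "Frank's Pizza", "food": "Italian", "area": "riverside"}
template = [("field", "name"), ("word", "serves"), ("field", "food"),
            ("word", "food in the"), ("field", "area"), ("word", "area .")]
print(" ".join(x if kind == "word" else record[x] for kind, x in template))
# Frank's Pizza serves Italian food in the riverside area .
```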

2017

Challenges in Data-to-Document Generation
Sam Wiseman | Stuart Shieber | Alexander Rush
Proceedings of the 2017 Conference on Empirical Methods in Natural Language Processing

Recent neural models have shown significant progress on the problem of generating short descriptive texts conditioned on a small number of database records. In this work, we suggest a slightly more difficult data-to-text generation task, and investigate how effective current approaches are on this task. In particular, we introduce a new, large-scale corpus of data records paired with descriptive documents, propose a series of extractive evaluation methods for analyzing performance, and obtain baseline results using current neural generation methods. Experiments show that these models produce fluent text, but fail to convincingly approximate human-generated documents. Moreover, even templated baselines exceed the performance of these neural models on some metrics, though copy- and reconstruction-based extensions lead to noticeable improvements.
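
A toy version of the extractive evaluation: pull (entity, relation, value) mentions out of the generated text and score them against the source records. The paper trains an information-extraction model for this step; the regex below is only a stand-in to show the shape of the metric.

```python
# Toy relation-precision computation; the extractor is a placeholder.
import re

records = {("Lebron James", "PTS"): 27, ("Kevin Love", "PTS"): 18}
generated = "Lebron James scored 27 points while Kevin Love added 20 points."

pattern = r"([A-Z][a-z]+ [A-Z][a-z]+) (?:scored|added) (\d+) points"
extracted = {(m.group(1), "PTS"): int(m.group(2))
             for m in re.finditer(pattern, generated)}
correct = sum(records.get(k) == v for k, v in extracted.items())
print(f"relation precision: {correct}/{len(extracted)}")  # 1/2
```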

Adapting Sequence Models for Sentence Correction
Allen Schmaltz | Yoon Kim | Alexander Rush | Stuart Shieber
Proceedings of the 2017 Conference on Empirical Methods in Natural Language Processing

In a controlled experiment of sequence-to-sequence approaches for the task of sentence correction, we find that character-based models are generally more effective than word-based models and models that encode subword information via convolutions, and that modeling the output data as a series of diffs improves effectiveness over standard approaches. Our strongest sequence-to-sequence model improves over our strongest phrase-based statistical machine translation model, with access to the same data, by 6 M2 (0.5 GLEU) points. Additionally, in the data environment of the standard CoNLL-2014 setup, we demonstrate that modeling (and tuning against) diffs yields similar or better M2 scores with simpler models and/or significantly less data than previous sequence-to-sequence approaches.
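
The diff-based output representation can be sketched with Python’s difflib (the paper’s exact diff scheme may differ): the target sequence copies unchanged spans and marks edits explicitly, so the model learns to emit edits rather than re-generate the whole sentence.

```python
# Turn a (source, target) pair into a diff-annotated target sequence.
import difflib

source = "We went to the park yesterday .".split()
target = "We went to a park yesterday .".split()

sm = difflib.SequenceMatcher(a=source, b=target)
diff_target = []
for op, i1, i2, j1, j2 in sm.get_opcodes():
    if op == "equal":
        diff_target.extend(source[i1:i2])
    else:  # replace / delete / insert become explicit edit spans
        diff_target += ["<del>", *source[i1:i2], "</del>",
                        "<ins>", *target[j1:j2], "</ins>"]
print(" ".join(diff_target))
# We went to <del> the </del> <ins> a </ins> park yesterday .
```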

Reflexives and Reciprocals in Synchronous Tree Adjoining Grammar
Cristina Aggazzotti | Stuart M. Shieber
Proceedings of the 13th International Workshop on Tree Adjoining Grammars and Related Formalisms

2016

Word Ordering Without Syntax
Allen Schmaltz | Alexander M. Rush | Stuart Shieber
Proceedings of the 2016 Conference on Empirical Methods in Natural Language Processing

Learning Global Features for Coreference Resolution
Sam Wiseman | Alexander M. Rush | Stuart M. Shieber
Proceedings of the 2016 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies

Sentence-Level Grammatical Error Identification as Sequence-to-Sequence Correction
Allen Schmaltz | Yoon Kim | Alexander M. Rush | Stuart Shieber
Proceedings of the 11th Workshop on Innovative Use of NLP for Building Educational Applications

Antecedent Prediction Without a Pipeline
Sam Wiseman | Alexander M. Rush | Stuart Shieber
Proceedings of the Workshop on Coreference Resolution Beyond OntoNotes (CORBON 2016)

2015

Learning Anaphoricity and Antecedent Ranking Features for Coreference Resolution
Sam Wiseman | Alexander M. Rush | Stuart Shieber | Jason Weston
Proceedings of the 53rd Annual Meeting of the Association for Computational Linguistics and the 7th International Joint Conference on Natural Language Processing (Volume 1: Long Papers)

2014

Eliciting and Annotating Uncertainty in Spoken Language
Heather Pon-Barry | Stuart Shieber | Nicholas Longenbaugh
Proceedings of the Ninth International Conference on Language Resources and Evaluation (LREC'14)

A major challenge in the field of automatic recognition of emotion and affect in speech is the subjective nature of affect labels. The most common approach to acquiring affect labels is to ask a panel of listeners to rate a corpus of spoken utterances along one or more dimensions of interest. For applications ranging from educational technology to voice search to dictation, a speaker’s level of certainty is a primary dimension of interest. In such applications, we would like to know the speaker’s actual level of certainty, but past research has only revealed listeners’ perception of the speaker’s level of certainty. In this paper, we present a method for eliciting spoken utterances using stimuli designed to have a quantitative, crowdsourced legibility score. While we cannot control a speaker’s actual internal level of certainty, the use of these stimuli provides a better estimate of internal certainty compared to existing speech corpora. The Harvard Uncertainty Speech Corpus, containing speech data, certainty annotations, and prosodic features, is made available to the research community.

2013

A Context Free TAG Variant
Ben Swanson | Elif Yamangil | Eugene Charniak | Stuart Shieber
Proceedings of the 51st Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers)

Nonparametric Bayesian Inference and Efficient Parsing for Tree-adjoining Grammars
Elif Yamangil | Stuart M. Shieber
Proceedings of the 51st Annual Meeting of the Association for Computational Linguistics (Volume 2: Short Papers)

2012

Estimating Compact Yet Rich Tree Insertion Grammars
Elif Yamangil | Stuart Shieber
Proceedings of the 50th Annual Meeting of the Association for Computational Linguistics (Volume 2: Short Papers)

2010

Bayesian Synchronous Tree-Substitution Grammar Induction and Its Application to Sentence Compression
Elif Yamangil | Stuart M. Shieber
Proceedings of the 48th Annual Meeting of the Association for Computational Linguistics

Complexity, Parsing, and Factorization of Tree-Local Multi-Component Tree-Adjoining Grammar
Rebecca Nesson | Giorgio Satta | Stuart M. Shieber
Computational Linguistics, Volume 36, Issue 3 - September 2010

2009

Efficiently Parsable Extensions to Tree-Local Multicomponent TAG
Rebecca Nesson | Stuart Shieber
Proceedings of Human Language Technologies: The 2009 Annual Conference of the North American Chapter of the Association for Computational Linguistics

The Importance of Sub-Utterance Prosody in Predicting Level of Certainty
Heather Pon-Barry | Stuart Shieber
Proceedings of Human Language Technologies: The 2009 Annual Conference of the North American Chapter of the Association for Computational Linguistics, Companion Volume: Short Papers

2008

Synchronous Vector TAG for Syntax and Semantics: Control Verbs, Relative Clauses, and Inverse Linking
Rebecca Nesson | Stuart Shieber
Proceedings of the Ninth International Workshop on Tree Adjoining Grammar and Related Frameworks (TAG+9)

Optimal k-arization of Synchronous Tree-Adjoining Grammar
Rebecca Nesson | Giorgio Satta | Stuart M. Shieber
Proceedings of ACL-08: HLT

2007

Extraction Phenomena in Synchronous TAG Syntax and Semantics
Rebecca Nesson | Stuart M. Shieber
Proceedings of SSST, NAACL-HLT 2007 / AMTA Workshop on Syntax and Structure in Statistical Translation

Probabilistic Synchronous Tree-Adjoining Grammars for Machine Translation: The Argument from Bilingual Dictionaries
Stuart M. Shieber
Proceedings of SSST, NAACL-HLT 2007 / AMTA Workshop on Syntax and Structure in Statistical Translation

Synchronous Grammars and Transducers: Good News and Bad News
Stuart Shieber
Proceedings of the Tenth International Conference on Parsing Technologies

2006

Induction of Probabilistic Synchronous Tree-Insertion Grammars for Machine Translation
Rebecca Nesson | Stuart Shieber | Alexander Rush
Proceedings of the 7th Conference of the Association for Machine Translation in the Americas: Technical Papers

The more expressive and flexible a base formalism for machine translation is, the less efficient parsing of it will be. However, even among formalisms with the same parse complexity, some formalisms better realize the desired characteristics for machine translation formalisms than others. We introduce a particular formalism, probabilistic synchronous tree-insertion grammar (PSTIG), that we argue satisfies the desiderata optimally within the class of formalisms that can be parsed no less efficiently than context-free grammars, and we demonstrate that it outperforms state-of-the-art word-based and phrase-based finite-state translation models on training and test data taken from the EuroParl corpus (Koehn, 2005). We then argue that a higher level of translation quality can be achieved by hybridizing our induced model with elementary structures produced using supervised techniques such as those of Groves et al. (2004).

Towards Robust Context-Sensitive Sentence Alignment for Monolingual Corpora
Rani Nelken | Stuart M. Shieber
11th Conference of the European Chapter of the Association for Computational Linguistics

Unifying Synchronous Tree Adjoining Grammars and Tree Transducers via Bimorphisms
Stuart M. Shieber
11th Conference of the European Chapter of the Association for Computational Linguistics

2005

Arabic Diacritization Using Weighted Finite-State Transducers
Rani Nelken | Stuart M. Shieber
Proceedings of the ACL Workshop on Computational Approaches to Semitic Languages

2004

Unifying Annotated Discourse Hierarchies to Create a Gold Standard
Marco Carbone | Ya’akov Gal | Stuart Shieber | Barbara Grosz
Proceedings of the 5th SIGdial Workshop on Discourse and Dialogue at HLT-NAACL 2004

Synchronous Grammars as Tree Transducers
Stuart M. Shieber
Proceedings of the 7th International Workshop on Tree Adjoining Grammar and Related Formalisms

A learning approach to improving sentence-level MT evaluation
Alex Kulesza | Stuart M. Shieber
Proceedings of the 10th Conference on Theoretical and Methodological Issues in Machine Translation of Natural Languages

2003

Comma Restoration Using Constituency Information
Stuart M. Shieber | Xiaopeng Tao
Proceedings of the 2003 Human Language Technology Conference of the North American Chapter of the Association for Computational Linguistics

Partially Ordered Multiset Context-free Grammars and Free-word-order Parsing
Mark-Jan Nederhof | Giorgio Satta | Stuart Shieber
Proceedings of the Eighth International Conference on Parsing Technologies

We present a new formalism, partially ordered multiset context-free grammars (poms-CFG), along with an Earley-style parsing algorithm. The formalism, which can be thought of as a generalization of context-free grammars with partially ordered right-hand sides, is of interest in its own right, and also as infrastructure for obtaining tighter complexity bounds for more expressive context-free formalisms intended to express free or multiple word-order, such as ID/LP grammars. We reduce ID/LP grammars to poms-grammars, thereby getting finer-grained bounds on the parsing complexity of ID/LP grammars. We argue that in practice, the width of attested ID/LP grammars is small, yielding effectively polynomial time complexity for ID/LP grammar parsing.
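
What a poms right-hand side licenses can be checked directly: any linearization of the multiset that is consistent with its partial order, which is how ID/LP-style word-order freedom is captured. The sketch below illustrates the formalism’s semantics; it is not the paper’s Earley-style parser.

```python
# Enumerate linearizations of an RHS multiset under LP constraints.
from itertools import permutations

symbols = ["Subj", "V", "Obj"]
precedes = {("Subj", "V")}  # LP constraint: Subj must precede V

def linearizations(syms, order):
    for perm in set(permutations(syms)):
        if all(perm.index(a) < perm.index(b) for a, b in order):
            yield perm

for lin in sorted(linearizations(symbols, precedes)):
    print(" ".join(lin))  # 3 of the 3! = 6 orders satisfy the constraint
```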

2002

The LinGO Redwoods Treebank: Motivation and Preliminary Applications
Stephan Oepen | Kristina Toutanova | Stuart Shieber | Christopher Manning | Dan Flickinger | Thorsten Brants
COLING 2002: The 17th International Conference on Computational Linguistics: Project Notes

1997

Anaphoric Dependencies in Ellipsis
Andrew Kehler | Stuart Shieber
Computational Linguistics, Volume 23, Number 3, September 1997

1994

An Alternative Conception of Tree-Adjoining Derivation
Yves Schabes | Stuart M. Shieber
Computational Linguistics, Volume 20, Number 1, March 1994

1993

The problem of logical form equivalence
Stuart M. Shieber
Computational Linguistics, Volume 19, Number 1, March 1993, Special Issue on Using Large Corpora: I

1992

An Alternative Conception of Tree-Adjoining Derivation
Yves Schabes | Stuart M. Shieber
30th Annual Meeting of the Association for Computational Linguistics

1990

Synchronous Tree-Adjoining Grammars
Stuart M. Shieber | Yves Schabes
COLING 1990 Volume 3: Papers presented to the 13th International Conference on Computational Linguistics

Generation and Synchronous Tree-Adjoining Grammars
Stuart M. Shieber | Yves Schabes
Proceedings of the Fifth International Workshop on Natural Language Generation

Formal properties of Synchronous Tree-Adjoining Grammars
Stuart Shieber
Proceedings of the First International Workshop on Tree Adjoining Grammar and Related Frameworks (TAG+1)

Semantic-Head-Driven Generation
Stuart M. Shieber | Gertjan van Noord | Fernando C. N. Pereira | Robert C. Moore
Computational Linguistics, Volume 16, Number 1, March 1990

1989

A Semantic-Head-Driven Generation Algorithm for Unification-Based Formalisms
Stuart M. Shieber | Gertjan van Noord | Robert C. Moore | Fernando C. N. Pereira
27th Annual Meeting of the Association for Computational Linguistics

1988

A Uniform Architecture for Parsing and Generation
Stuart M. Shieber
Coling Budapest 1988 Volume 2: International Conference on Computational Linguistics

1987

An Algorithm for Generating Quantifier Scopings
Jerry R. Hobbs | Stuart M. Shieber
Computational Linguistics, Formerly the American Journal of Computational Linguistics, Volume 13, Numbers 1-2, January-June 1987

1986

A Simple Reconstruction of GPSG
Stuart M. Shieber
Coling 1986 Volume 1: The 11th International Conference on Computational Linguistics

1985

Using Restriction to Extend Parsing Algorithms for Complex-Feature-Based Formalisms
Stuart M. Shieber
23rd Annual Meeting of the Association for Computational Linguistics

1984

The Semantics of Grammar Formalisms Seen as Computer Languages
Fernando C. N. Pereira | Stuart M. Shieber
10th International Conference on Computational Linguistics and 22nd Annual Meeting of the Association for Computational Linguistics

The Design of a Computer Language for Linguistic Information
Stuart M. Shieber
10th International Conference on Computational Linguistics and 22nd Annual Meeting of the Association for Computational Linguistics

1983

Formal Constraints on Metarules
Stuart M. Shieber | Susan U. Stucky | Hans Uszkoreit | Jane J. Robinson
21st Annual Meeting of the Association for Computational Linguistics

Sentence Disambiguation by a Shift-Reduce Parsing Technique
Stuart M. Shieber
21st Annual Meeting of the Association for Computational Linguistics

1982

Translating English Into Logical Form
Stanley J. Rosenschein | Stuart M. Shieber
20th Annual Meeting of the Association for Computational Linguistics