Sida I. Wang

Also published as: Sida Wang


pdf bib
On Continual Model Refinement in Out-of-Distribution Data Streams
Bill Yuchen Lin | Sida Wang | Xi Lin | Robin Jia | Lin Xiao | Xiang Ren | Scott Yih
Proceedings of the 60th Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers)

Real-world natural language processing (NLP) models need to be continually updated to fix the prediction errors in out-of-distribution (OOD) data streams while overcoming catastrophic forgetting. However, existing continual learning (CL) problem setups cannot cover such a realistic and complex scenario. In response to this, we propose a new CL problem formulation dubbed continual model refinement (CMR). Compared to prior CL settings, CMR is more practical and introduces unique challenges (boundary-agnostic and non-stationary distribution shift, diverse mixtures of multiple OOD data clusters, error-centric streams, etc.). We extend several existing CL approaches to the CMR setting and evaluate them extensively. For benchmarking and analysis, we propose a general sampling algorithm to obtain dynamic OOD data streams with controllable non-stationarity, as well as a suite of metrics measuring various aspects of online performance. Our experiments and detailed analysis reveal the promise and challenges of the CMR problem, supporting that studying CMR in dynamic OOD streams can benefit the longevity of deployed NLP models in production.


pdf bib
Bilingual Lexicon Induction via Unsupervised Bitext Construction and Word Alignment
Haoyue Shi | Luke Zettlemoyer | Sida I. Wang
Proceedings of the 59th Annual Meeting of the Association for Computational Linguistics and the 11th International Joint Conference on Natural Language Processing (Volume 1: Long Papers)

Bilingual lexicons map words in one language to their translations in another, and are typically induced by learning linear projections to align monolingual word embedding spaces. In this paper, we show it is possible to produce much higher quality lexicons with methods that combine (1) unsupervised bitext mining and (2) unsupervised word alignment. Directly applying a pipeline that uses recent algorithms for both subproblems significantly improves induced lexicon quality and further gains are possible by learning to filter the resulting lexical entries, with both unsupervised and semi-supervised schemes. Our final model outperforms the state of the art on the BUCC 2020 shared task by 14 F1 points averaged over 12 language pairs, while also providing a more interpretable approach that allows for rich reasoning of word meaning in context. Further analysis of our output and the standard reference lexicons suggests they are of comparable quality, and new benchmarks may be needed to measure further progress on this task.


pdf bib
Interactive Classification by Asking Informative Questions
Lili Yu | Howard Chen | Sida I. Wang | Tao Lei | Yoav Artzi
Proceedings of the 58th Annual Meeting of the Association for Computational Linguistics

We study the potential for interaction in natural language classification. We add a limited form of interaction for intent classification, where users provide an initial query using natural language, and the system asks for additional information using binary or multi-choice questions. At each turn, our system decides between asking the most informative question or making the final classification pre-diction. The simplicity of the model allows for bootstrapping of the system without interaction data, instead relying on simple crowd-sourcing tasks. We evaluate our approach on two domains, showing the benefit of interaction and the advantage of learning to balance between asking additional questions and making the final prediction.

pdf bib
Grounded Adaptation for Zero-shot Executable Semantic Parsing
Victor Zhong | Mike Lewis | Sida I. Wang | Luke Zettlemoyer
Proceedings of the 2020 Conference on Empirical Methods in Natural Language Processing (EMNLP)

We propose Grounded Adaptation for Zeroshot Executable Semantic Parsing (GAZP) to adapt an existing semantic parser to new environments (e.g. new database schemas). GAZP combines a forward semantic parser with a backward utterance generator to synthesize data (e.g. utterances and SQL queries) in the new environment, then selects cycle-consistent examples to adapt the parser. Unlike data-augmentation, which typically synthesizes unverified examples in the training environment, GAZP synthesizes examples in the new environment whose input-output consistency are verified through execution. On the Spider, Sparc, and CoSQL zero-shot semantic parsing tasks, GAZP improves logical form and execution accuracy of the baseline parser. Our analyses show that GAZP outperforms data-augmentation in the training environment, performance increases with the amount of GAZP-synthesized data, and cycle-consistency is central to successful adaptation.


pdf bib
Simple Recurrent Units for Highly Parallelizable Recurrence
Tao Lei | Yu Zhang | Sida I. Wang | Hui Dai | Yoav Artzi
Proceedings of the 2018 Conference on Empirical Methods in Natural Language Processing

Common recurrent neural architectures scale poorly due to the intrinsic difficulty in parallelizing their state computations. In this work, we propose the Simple Recurrent Unit (SRU), a light recurrent unit that balances model capacity and scalability. SRU is designed to provide expressive recurrence, enable highly parallelized implementation, and comes with careful initialization to facilitate training of deep models. We demonstrate the effectiveness of SRU on multiple NLP tasks. SRU achieves 5—9x speed-up over cuDNN-optimized LSTM on classification and question answering datasets, and delivers stronger results than LSTM and convolutional models. We also obtain an average of 0.7 BLEU improvement over the Transformer model (Vaswani et al., 2017) on translation by incorporating SRU into the architecture.


pdf bib
Naturalizing a Programming Language via Interactive Learning
Sida I. Wang | Samuel Ginn | Percy Liang | Christopher D. Manning
Proceedings of the 55th Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers)

Our goal is to create a convenient natural language interface for performing well-specified but complex actions such as analyzing data, manipulating text, and querying databases. However, existing natural language interfaces for such tasks are quite primitive compared to the power one wields with a programming language. To bridge this gap, we start with a core programming language and allow users to “naturalize” the core language incrementally by defining alternative, more natural syntax and increasingly complex concepts in terms of compositions of simpler ones. In a voxel world, we show that a community of users can simultaneously teach a common system a diverse language and use it to build hundreds of complex voxel structures. Over the course of three days, these users went from using only the core language to using the naturalized language in 85.9% of the last 10K utterances.


pdf bib
Learning Language Games through Interaction
Sida I. Wang | Percy Liang | Christopher D. Manning
Proceedings of the 54th Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers)


pdf bib
Human Effort and Machine Learnability in Computer Aided Translation
Spence Green | Sida I. Wang | Jason Chuang | Jeffrey Heer | Sebastian Schuster | Christopher D. Manning
Proceedings of the 2014 Conference on Empirical Methods in Natural Language Processing (EMNLP)


pdf bib
Feature Noising for Log-Linear Structured Prediction
Sida Wang | Mengqiu Wang | Stefan Wager | Percy Liang | Christopher D. Manning
Proceedings of the 2013 Conference on Empirical Methods in Natural Language Processing

pdf bib
Fast and Adaptive Online Training of Feature-Rich Translation Models
Spence Green | Sida Wang | Daniel Cer | Christopher D. Manning
Proceedings of the 51st Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers)

pdf bib
Feature-Rich Phrase-based Translation: Stanford University’s Submission to the WMT 2013 Translation Task
Spence Green | Daniel Cer | Kevin Reschke | Rob Voigt | John Bauer | Sida Wang | Natalia Silveira | Julia Neidert | Christopher D. Manning
Proceedings of the Eighth Workshop on Statistical Machine Translation


pdf bib
Baselines and Bigrams: Simple, Good Sentiment and Topic Classification
Sida Wang | Christopher Manning
Proceedings of the 50th Annual Meeting of the Association for Computational Linguistics (Volume 2: Short Papers)