Su Wang


pdf bib
On the Evaluation of Vision-and-Language Navigation Instructions
Ming Zhao | Peter Anderson | Vihan Jain | Su Wang | Alexander Ku | Jason Baldridge | Eugene Ie
Proceedings of the 16th Conference of the European Chapter of the Association for Computational Linguistics: Main Volume

Vision-and-Language Navigation wayfinding agents can be enhanced by exploiting automatically generated navigation instructions. However, existing instruction generators have not been comprehensively evaluated, and the automatic evaluation metrics used to develop them have not been validated. Using human wayfinders, we show that these generators perform on par with or only slightly better than a template-based generator and far worse than human instructors. Furthermore, we discover that BLEU, ROUGE, METEOR and CIDEr are ineffective for evaluating grounded navigation instructions. To improve instruction evaluation, we propose an instruction-trajectory compatibility model that operates without reference instructions. Our model shows the highest correlation with human wayfinding outcomes when scoring individual instructions. For ranking instruction generation systems, if reference instructions are available we recommend using SPICE.


pdf bib
Query-focused Scenario Construction
Su Wang | Greg Durrett | Katrin Erk
Proceedings of the 2019 Conference on Empirical Methods in Natural Language Processing and the 9th International Joint Conference on Natural Language Processing (EMNLP-IJCNLP)

The news coverage of events often contains not one but multiple incompatible accounts of what happened. We develop a query-based system that extracts compatible sets of events (scenarios) from such data, formulated as one-class clustering. Our system incrementally evaluates each event’s compatibility with already selected events, taking order into account. We use synthetic data consisting of article mixtures for scalable training and evaluate our model on a new human-curated dataset of scenarios about real-world news topics. Stronger neural network models and harder synthetic training settings are both important to achieve high performance, and our final scenario construction system substantially outperforms baselines based on prior work.


pdf bib
Picking Apart Story Salads
Su Wang | Eric Holgate | Greg Durrett | Katrin Erk
Proceedings of the 2018 Conference on Empirical Methods in Natural Language Processing

During natural disasters and conflicts, information about what happened is often confusing and messy, and distributed across many sources. We would like to be able to automatically identify relevant information and assemble it into coherent narratives of what happened. To make this task accessible to neural models, we introduce Story Salads, mixtures of multiple documents that can be generated at scale. By exploiting the Wikipedia hierarchy, we can generate salads that exhibit challenging inference problems. Story salads give rise to a novel, challenging clustering task, where the objective is to group sentences from the same narratives. We demonstrate that simple bag-of-words similarity clustering falls short on this task, and that it is necessary to take into account global context and coherence.

pdf bib
Modeling Semantic Plausibility by Injecting World Knowledge
Su Wang | Greg Durrett | Katrin Erk
Proceedings of the 2018 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies, Volume 2 (Short Papers)

Distributional data tells us that a man can swallow candy, but not that a man can swallow a paintball, since this is never attested. However both are physically plausible events. This paper introduces the task of semantic plausibility: recognizing plausible but possibly novel events. We present a new crowdsourced dataset of semantic plausibility judgments of single events such as man swallow paintball. Simple models based on distributional representations perform poorly on this task, despite doing well on selection preference, but injecting manually elicited knowledge about entity properties provides a substantial performance boost. Our error analysis shows that our new dataset is a great testbed for semantic plausibility models: more sophisticated knowledge representation and propagation could address many of the remaining errors.


pdf bib
Distributional Modeling on a Diet: One-shot Word Learning from Text Only
Su Wang | Stephen Roller | Katrin Erk
Proceedings of the Eighth International Joint Conference on Natural Language Processing (Volume 1: Long Papers)

We test whether distributional models can do one-shot learning of definitional properties from text only. Using Bayesian models, we find that first learning overarching structure in the known data, regularities in textual contexts and in properties, helps one-shot learning, and that individual context items can be highly informative.

pdf bib
Leveraging Discourse Information Effectively for Authorship Attribution
Elisa Ferracane | Su Wang | Raymond Mooney
Proceedings of the Eighth International Joint Conference on Natural Language Processing (Volume 1: Long Papers)

We explore techniques to maximize the effectiveness of discourse information in the task of authorship attribution. We present a novel method to embed discourse features in a Convolutional Neural Network text classifier, which achieves a state-of-the-art result by a significant margin. We empirically investigate several featurization methods to understand the conditions under which discourse features contribute non-trivial performance gains, and analyze discourse embeddings.