Matthew Stone


2021

COSMic: A Coherence-Aware Generation Metric for Image Descriptions
Mert Inan | Piyush Sharma | Baber Khalid | Radu Soricut | Matthew Stone | Malihe Alikhani
Findings of the Association for Computational Linguistics: EMNLP 2021

Developers of text generation models rely on automated evaluation metrics as a stand-in for slow and expensive manual evaluations. However, image captioning metrics have struggled to give accurate learned estimates of the semantic and pragmatic success of output text. We address this weakness by introducing the first discourse-aware learned generation metric for evaluating image descriptions. Our approach is inspired by computational theories of discourse for capturing information goals using coherence. We present a dataset of image–description pairs annotated with coherence relations. We then train a coherence-aware metric on a subset of the Conceptual Captions dataset and measure its effectiveness—its ability to predict human ratings of output captions—on a test set composed of out-of-domain images. We demonstrate a higher Kendall Correlation Coefficient for our proposed metric with the human judgments for the results of a number of state-of-the-art coherence-aware caption generation models when compared to several other metrics including recently proposed learned metrics such as BLEURT and BERTScore.

2020

Cross-modal Coherence Modeling for Caption Generation
Malihe Alikhani | Piyush Sharma | Shengjie Li | Radu Soricut | Matthew Stone
Proceedings of the 58th Annual Meeting of the Association for Computational Linguistics

We use coherence relations inspired by computational models of discourse to study the information needs and goals of image captioning. Using an annotation protocol specifically devised for capturing image–caption coherence relations, we annotate 10,000 instances from publicly-available image–caption pairs. We introduce a new task for learning inferences in imagery and text, coherence relation prediction, and show that these coherence annotations can be exploited to learn relation classifiers as an intermediary step, and also train coherence-aware, controllable image captioning models. The results show a dramatic improvement in the consistency and quality of the generated captions with respect to information needs specified via coherence relations.

Achieving Common Ground in Multi-modal Dialogue
Malihe Alikhani | Matthew Stone
Proceedings of the 58th Annual Meeting of the Association for Computational Linguistics: Tutorial Abstracts

All communication aims at achieving common ground (grounding): interlocutors can work together effectively only with mutual beliefs about what the state of the world is, about what their goals are, and about how they plan to make their goals a reality. Computational dialogue research offers some classic results on grounding, which unfortunately offer scant guidance to the design of grounding modules and behaviors in cutting-edge systems. In this tutorial, we focus on three main topic areas: 1) grounding in human-human communication; 2) grounding in dialogue systems; and 3) grounding in multi-modal interactive systems, including image-oriented conversations and human-robot interactions. We highlight a number of achievements of recent computational research in coordinating complex content, show how these results lead to rich and challenging opportunities for doing grounding in more flexible and powerful ways, and canvass relevant insights from the literature on human–human conversation. We expect that the tutorial will be of interest to researchers in dialogue systems, computational semantics and cognitive modeling, and hope that it will catalyze research and system building that more directly explores the creative, strategic ways conversational agents might be able to seek and offer evidence about their understanding of their interlocutors.

Combining Cognitive Modeling and Reinforcement Learning for Clarification in Dialogue
Baber Khalid | Malihe Alikhani | Matthew Stone
Proceedings of the 28th International Conference on Computational Linguistics

In many domains, dialogue systems need to work collaboratively with users to successfully reconstruct the meaning the user had in mind. In this paper, we show how cognitive models of users’ communicative strategies can be leveraged in a reinforcement learning approach to dialogue planning to enable interactive systems to give targeted, effective feedback about the system’s understanding. We describe a prototype system that collaborates on reference tasks that distinguish arbitrarily varying color patches from similar distractors, and use experiments with crowd workers and analyses of our learned policies to document that our approach leads to context-sensitive clarification strategies that focus on key missing information, elicit correct answers that the system understands, and contribute to increasing dialogue success.

Aspectuality Across Genre: A Distributional Semantics Approach
Thomas Kober | Malihe Alikhani | Matthew Stone | Mark Steedman
Proceedings of the 28th International Conference on Computational Linguistics

The interpretation of the lexical aspect of verbs in English plays a crucial role in tasks such as recognizing textual entailment and learning discourse-level inferences. We show that two elementary dimensions of aspectual class, states vs. events, and telic vs. atelic events, can be modelled effectively with distributional semantics. We find that a verb’s local context is most indicative of its aspectual class, and we demonstrate that closed class words tend to be stronger discriminating contexts than content words. Our approach outperforms previous work on three datasets. Further, we present a new dataset of human-human conversations annotated with lexical aspects and present experiments that show the correlation of telicity with genre and discourse goals.

Analyzing Speaker Strategy in Referential Communication
Brian McMahan | Matthew Stone
Proceedings of the 21st Annual Meeting of the Special Interest Group on Discourse and Dialogue

We analyze a corpus of referential communication through the lens of quantitative models of speaker reasoning. Different models place different emphases on linguistic reasoning and collaborative reasoning. This leads models to make different assessments of the risks and rewards of using specific utterances in specific contexts. By fitting a latent variable model to the corpus, we can exhibit utterances that give systematic evidence of the diverse kinds of reasoning speakers employ, and build integrated models that recognize not only speaker reference but also speaker reasoning.

2019

“Caption” as a Coherence Relation: Evidence and Implications
Malihe Alikhani | Matthew Stone
Proceedings of the Second Workshop on Shortcomings in Vision and Language

We study verbs in image–text corpora, contrasting caption corpora, where texts are explicitly written to characterize image content, with depiction corpora, where texts and images may stand in more general relations. Captions show a distinctively limited distribution of verbs, with strong preferences for specific tense, aspect, lexical aspect, and semantic field. These limitations, which appear in data elicited by a range of methods, restrict the utility of caption corpora to inform image retrieval, multimodal document generation, and perceptually-grounded semantic models. We suggest that these limitations reflect the discourse constraints in play when subjects write texts to accompany imagery, so we argue that future development of image–text corpora should work to increase the diversity of event descriptions, while looking explicitly at the different ways text and imagery can be coherently related.

CITE: A Corpus of Image-Text Discourse Relations
Malihe Alikhani | Sreyasi Nag Chowdhury | Gerard de Melo | Matthew Stone
Proceedings of the 2019 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies, Volume 1 (Long and Short Papers)

This paper presents a novel crowd-sourced resource for multimodal discourse: our resource characterizes inferences in image-text contexts in the domain of cooking recipes in the form of coherence relations. Like previous corpora annotating discourse structure between text arguments, such as the Penn Discourse Treebank, our new corpus aids in establishing a better understanding of natural communication and common-sense reasoning, while our findings have implications for a wide range of applications, such as understanding and generation of multimodal documents.

2018

Arrows are the Verbs of Diagrams
Malihe Alikhani | Matthew Stone
Proceedings of the 27th International Conference on Computational Linguistics

Arrows are a key ingredient of schematic pictorial communication. This paper investigates the interpretation of arrows through linguistic, crowdsourcing and machine-learning methodology. Our work establishes a novel analogy between arrows and verbs: we advocate representing arrows in terms of qualitatively different structural and semantic frames, and resolving frames to specific interpretations using shallow world knowledge.

2016

Syntactic realization with data-driven neural tree grammars
Brian McMahan | Matthew Stone
Proceedings of COLING 2016, the 26th International Conference on Computational Linguistics: Technical Papers

A key component in surface realization in natural language generation is to choose concrete syntactic relationships to express a target meaning. We develop a new method for syntactic choice based on learning a stochastic tree grammar in a neural architecture. This framework can exploit state-of-the-art methods for modeling word sequences and generalizing across vocabulary. We also induce embeddings to generalize over elementary tree structures and exploit a tree recurrence over the input structure to model long-distance influences between NLG choices. We evaluate the models on the task of linearizing unannotated dependency trees, documenting the contribution of our modeling techniques to improvements in both accuracy and run time.

2015

A Bayesian Model of Grounded Color Semantics
Brian McMahan | Matthew Stone
Transactions of the Association for Computational Linguistics, Volume 3

Natural language meanings allow speakers to encode important real-world distinctions, but corpora of grounded language use also reveal that speakers categorize the world in different ways and describe situations with different terminology. To learn meanings from data, we therefore need to link underlying representations of meaning to models of speaker judgment and speaker choice. This paper describes a new approach to this problem: we model variability through uncertainty in categorization boundaries and distributions over preferred vocabulary. We apply the approach to a large data set of color descriptions, where statistical evaluation documents its accuracy. The results are available as a Lexicon of Uncertain Color Standards (LUX), which supports future efforts in grounded language understanding and generation by probabilistically mapping 829 English color descriptions to potentially context-sensitive regions in HSV color space.

Proceedings of the 11th International Conference on Computational Semantics
Matthew Purver | Mehrnoosh Sadrzadeh | Matthew Stone
Proceedings of the 11th International Conference on Computational Semantics

2014

Proceedings of the 15th Annual Meeting of the Special Interest Group on Discourse and Dialogue (SIGDIAL)
Kallirroi Georgila | Matthew Stone | Helen Hastie | Ani Nenkova
Proceedings of the 15th Annual Meeting of the Special Interest Group on Discourse and Dialogue (SIGDIAL)

2013

Situated Utterances and Discourse Relations
Matthew Stone | Una Stojnic | Ernest Lepore
Proceedings of the 10th International Conference on Computational Semantics (IWCS 2013) – Short Papers

Training an integrated sentence planner on user dialogue
Brian McMahan | Matthew Stone
Proceedings of the SIGDIAL 2013 Conference

2012

Towards a Flexible Semantics: Colour Terms in Collaborative Reference Tasks
Bert Baumgaertner | Raquel Fernández | Matthew Stone
*SEM 2012: The First Joint Conference on Lexical and Computational Semantics – Volume 1: Proceedings of the main conference and the shared task, and Volume 2: Proceedings of the Sixth International Workshop on Semantic Evaluation (SemEval 2012)

2009

Learning to Interpret Utterances Using Dialogue History
David DeVault | Matthew Stone
Proceedings of the 12th Conference of the European Chapter of the ACL (EACL 2009)

2008

Support Collaboration by Teaching Fundamentals
Matthew Stone
Proceedings of the Third Workshop on Issues in Teaching Computational Linguistics

Language, Embodiment and Social Intelligence
Matthew Stone
Proceedings of the Fifth International Natural Language Generation Conference

2007

Sentence generation as a planning problem
Alexander Koller | Matthew Stone
Proceedings of the 45th Annual Meeting of the Association of Computational Linguistics

Human Language Technologies 2007: The Conference of the North American Chapter of the Association for Computational Linguistics; Proceedings of the Main Conference
Candace Sidner | Tanja Schultz | Matthew Stone | ChengXiang Zhai
Human Language Technologies 2007: The Conference of the North American Chapter of the Association for Computational Linguistics; Proceedings of the Main Conference

Human Language Technologies 2007: The Conference of the North American Chapter of the Association for Computational Linguistics; Companion Volume, Short Papers
Candace Sidner | Tanja Schultz | Matthew Stone | ChengXiang Zhai
Human Language Technologies 2007: The Conference of the North American Chapter of the Association for Computational Linguistics; Companion Volume, Short Papers

2005

Teaching Dialogue to Interdisciplinary Teams through Toolkits
Justine Cassell | Matthew Stone
Proceedings of the Second ACL Workshop on Effective Tools and Methodologies for Teaching NLP and CL

An Information-State Approach to Collaborative Reference
David DeVault | Natalia Kariaeva | Anubha Kothari | Iris Oved | Matthew Stone
Proceedings of the ACL Interactive Poster and Demonstration Sessions

2004

Interpreting Vague Utterances in Context
David DeVault | Matthew Stone
COLING 2004: Proceedings of the 20th International Conference on Computational Linguistics

Proceedings of the 7th International Workshop on Tree Adjoining Grammar and Related Formalisms
Owen Rambow | Matthew Stone
Proceedings of the 7th International Workshop on Tree Adjoining Grammar and Related Formalisms

2003

Anaphora and Discourse Structure
Bonnie Webber | Matthew Stone | Aravind Joshi | Alistair Knott
Computational Linguistics, Volume 29, Number 4, December 2003

2002

Lexicalized Grammar 101
Matthew Stone
Proceedings of the ACL-02 Workshop on Effective Tools and Methodologies for Teaching Natural Language Processing and Computational Linguistics

2000

On identifying sets
Matthew Stone
INLG’2000 Proceedings of the First International Conference on Natural Language Generation

Coordination and context-dependence in the generation of embodied conversation
Justine Cassell | Matthew Stone | Hao Yan
INLG’2000 Proceedings of the First International Conference on Natural Language Generation

Lexicalized grammar and the description of motion events
Matthew Stone | Tonia Bleam | Christine Doran | Martha Palmer
Proceedings of the Fifth International Workshop on Tree Adjoining Grammar and Related Frameworks (TAG+5)

1999

Discourse Relations: A Structural and Presuppositional Account Using Lexicalised TAG
Bonnie Webber | Alistair Knott | Matthew Stone | Aravind Joshi
Proceedings of the 37th Annual Meeting of the Association for Computational Linguistics

1998

Textual Economy Through Close Coupling of Syntax and Semantics
Matthew Stone | Bonnie Webber
Natural Language Generation

1997

Sentence Planning as Description Using Tree Adjoining Grammar
Matthew Stone | Christine Doran
35th Annual Meeting of the Association for Computational Linguistics and 8th Conference of the European Chapter of the Association for Computational Linguistics

1996

Paying Heed to Collocations
Matthew Stone | Christine Doran
Eighth International Natural Language Generation Workshop