Sarah Taylor

Also published as: Sarah M. Taylor


2015

Understanding Cultural Conflicts using Metaphors and Sociolinguistic Measures of Influence
Samira Shaikh | Tomek Strzalkowski | Sarah Taylor | John Lien | Ting Liu | George Aaron Broadwell | Laurie Feldman | Boris Yamrom | Kit Cho | Yuliya Peshkova
Proceedings of the Third Workshop on Metaphor in NLP

2014

Automatic Expansion of the MRC Psycholinguistic Database Imageability Ratings
Ting Liu | Kit Cho | G. Aaron Broadwell | Samira Shaikh | Tomek Strzalkowski | John Lien | Sarah Taylor | Laurie Feldman | Boris Yamrom | Nick Webb | Umit Boz | Ignacio Cases | Ching-sheng Lin
Proceedings of the Ninth International Conference on Language Resources and Evaluation (LREC'14)

Recent studies in metaphor extraction across several languages (Broadwell et al., 2013; Strzalkowski et al., 2013) have shown that word imageability ratings are highly correlated with the presence of metaphors in text. Information about imageability of words can be obtained from the MRC Psycholinguistic Database (MRCPD) for English words and Léxico Informatizado del Español Programa (LEXESP) for Spanish words, which are collections of human ratings obtained in a series of controlled surveys. Unfortunately, word imageability ratings were collected for only a limited number of words (9,240 in English and 6,233 in Spanish) and are not available at all for the other two languages studied, Russian and Farsi. The present study describes an automated method for expanding the MRCPD by conferring imageability ratings over the synonyms and hyponyms of existing MRCPD words, as identified in WordNet. The result is an expanded MRCPD+ database with imageability scores for more than 100,000 words. The appropriateness of this expansion process is assessed by examining the structural coherence of the expanded set and by validating the expanded lexicon against human judgment. Finally, the performance of the metaphor extraction system is shown to improve significantly with the expanded database. This paper describes the process for English MRCPD+ and the resulting lexical resource. The process is analogous for other languages.
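
The abstract above does not say how a rating is conferred on a synonym or hyponym, so the following is only a minimal sketch of the general idea, not the authors' method. It assumes that an unrated word inherits the mean rating of the rated words it is related to, and it stands in for MRCPD and WordNet with two small hand-written dictionaries. The function name `expand_ratings`, the words, and the rating values are all hypothetical.

```python
from statistics import mean


def expand_ratings(seed, related):
    """Give each unrated word the mean rating of its rated relatives.

    seed:    word -> known rating (a stand-in for existing MRCPD entries)
    related: rated word -> its synonyms and hyponyms
             (a stand-in for WordNet lookups)
    Averaging is an assumption made for this sketch only.
    """
    inherited = {}
    for word, rating in seed.items():
        for neighbor in related.get(word, ()):
            if neighbor not in seed:  # never overwrite a human rating
                inherited.setdefault(neighbor, []).append(rating)

    expanded = dict(seed)
    expanded.update({w: mean(rs) for w, rs in inherited.items()})
    return expanded


# Toy data: ratings and word relations are invented for illustration.
seed = {"dog": 600, "hunter": 500}
related = {"dog": ["hound", "puppy"], "hunter": ["hound", "stalker"]}

expanded = expand_ratings(seed, related)
for word in sorted(expanded):
    print(word, expanded[word])
```

Here "hound" is reachable from two rated words and receives their average, while "puppy" and "stalker" each inherit a single rating unchanged. A real expansion would also need the coherence checks and human validation that the abstract mentions.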

A Multi-Cultural Repository of Automatically Discovered Linguistic and Conceptual Metaphors
Samira Shaikh | Tomek Strzalkowski | Ting Liu | George Aaron Broadwell | Boris Yamrom | Sarah Taylor | Laurie Feldman | Kit Cho | Umit Boz | Ignacio Cases | Yuliya Peshkova | Ching-Sheng Lin
Proceedings of the Ninth International Conference on Language Resources and Evaluation (LREC'14)

In this article, we present details about our ongoing work towards building a repository of Linguistic and Conceptual Metaphors. This resource is being developed as part of our research effort into the large-scale detection of metaphors from unrestricted text. We have stored a large number of automatically extracted metaphors in American English, Mexican Spanish, Russian and Iranian Farsi in a relational database, along with pertinent metadata associated with these metaphors. A substantial subset of the contents of our repository has been systematically validated via rigorous social science experiments. Using information stored in the repository, we are able to posit certain claims in a cross-cultural context about how peoples in these cultures (America, Mexico, Russia and Iran) view particular concepts related to Governance and Economic Inequality through the use of metaphor. Researchers in the field can use this resource as a reference of typical metaphors used across these cultures. In addition, it can be used to recognize metaphors of the same form or pattern in other domains of research.

Computing Affect in Metaphors
Tomek Strzalkowski | Samira Shaikh | Kit Cho | George Aaron Broadwell | Laurie Feldman | Sarah Taylor | Boris Yamrom | Ting Liu | Ignacio Cases | Yuliya Peshkova | Kyle Elliot
Proceedings of the Second Workshop on Metaphor in NLP

Discovering Conceptual Metaphors using Source Domain Spaces
Samira Shaikh | Tomek Strzalkowski | Kit Cho | Ting Liu | George Aaron Broadwell | Laurie Feldman | Sarah Taylor | Boris Yamrom | Ching-Sheng Lin | Ning Sa | Ignacio Cases | Yuliya Peshkova | Kyle Elliot
Proceedings of the 4th Workshop on Cognitive Aspects of the Lexicon (CogALex)

2013

Robust Extraction of Metaphor from Novel Data
Tomek Strzalkowski | George Aaron Broadwell | Sarah Taylor | Laurie Feldman | Samira Shaikh | Ting Liu | Boris Yamrom | Kit Cho | Umit Boz | Ignacio Cases | Kyle Elliot
Proceedings of the First Workshop on Metaphor in NLP

2012

Modeling Leadership and Influence in Multi-party Online Discourse
Tomek Strzalkowski | Samira Shaikh | Ting Liu | George Aaron Broadwell | Jenny Stromer-Galley | Sarah Taylor | Umit Boz | Veena Ravishankar | Xiaoai Ren
Proceedings of COLING 2012

Extending the MPC corpus to Chinese and Urdu - A Multiparty Multi-Lingual Chat Corpus for Modeling Social Phenomena in Language
Ting Liu | Samira Shaikh | Tomek Strzalkowski | Aaron Broadwell | Jennifer Stromer-Galley | Sarah Taylor | Umit Boz | Xiaoai Ren | Jingsi Wu
Proceedings of the Eighth International Conference on Language Resources and Evaluation (LREC'12)

In this paper, we report our efforts in building a multi-lingual multi-party online chat corpus in order to develop a firm understanding of a set of social constructs such as agenda control, influence, and leadership, as well as to computationally model such constructs in online interactions. These automated models will help capture the dialogue dynamics that are essential for developing, among others, realistic human-machine dialogue systems, including autonomous virtual chat agents. We first introduce our experiment design and data collection method in Chinese and Urdu, and then report on the current stage of our data collection. We annotated the collected corpus on four levels: communication links, dialogue acts, local topics, and meso-topics. Results from the analyses of annotated data in different languages indicate some interesting phenomena, which are reported in this paper.

2010

VCA: An Experiment with a Multiparty Virtual Chat Agent
Samira Shaikh | Tomek Strzalkowski | Sarah Taylor | Nick Webb
Proceedings of the 2010 Workshop on Companionable Dialogue Systems

Enhancing Cross Document Coreference of Web Documents with Context Similarity and Very Large Scale Text Categorization
Jian Huang | Pucktada Treeratpituk | Sarah Taylor | C. Lee Giles
Proceedings of the 23rd International Conference on Computational Linguistics (Coling 2010)

Modeling Socio-Cultural Phenomena in Discourse
Tomek Strzalkowski | George Aaron Broadwell | Jennifer Stromer-Galley | Samira Shaikh | Sarah Taylor | Nick Webb
Proceedings of the 23rd International Conference on Computational Linguistics (Coling 2010)

MPC: A Multi-Party Chat Corpus for Modeling Social Phenomena in Discourse
Samira Shaikh | Tomek Strzalkowski | Aaron Broadwell | Jennifer Stromer-Galley | Sarah Taylor | Nick Webb
Proceedings of the Seventh International Conference on Language Resources and Evaluation (LREC'10)

In this paper, we describe our experience with collecting and creating an annotated corpus of multi-party online conversations in a chat-room environment. This effort is part of a larger project to develop computational models of social phenomena such as agenda control, influence, and leadership in on-line interactions. Such models will help capture the dialogue dynamics that are essential for developing, among others, realistic human-machine dialogue systems, including autonomous virtual chat agents. We describe the data collection method used and the characteristics of the initial dataset of English chat. We have devised a multi-tiered collection process in which the subjects start from simple, free-flowing conversations and progress towards more complex and structured interactions. We report on the first two stages of this process, which were recently completed. The third, large-scale collection effort is currently being conducted. All English dialogue has been annotated at four levels: communication links, dialogue acts, local topics and meso-topics. Some details of these annotations will be discussed later in this paper, although a full description is impossible within the scope of this article.

2009

Solving the “Who’s Mark Johnson Puzzle”: Information Extraction Based Cross Document Coreference
Jian Huang | Sarah M. Taylor | Jonathan L. Smith | Konstantinos A. Fotiadis | C. Lee Giles
Proceedings of Human Language Technologies: The 2009 Annual Conference of the North American Chapter of the Association for Computational Linguistics, Companion Volume: Student Research Workshop and Doctoral Consortium

Profile Based Cross-Document Coreference Using Kernelized Fuzzy Relational Clustering
Jian Huang | Sarah M. Taylor | Jonathan L. Smith | Konstantinos A. Fotiadis | C. Lee Giles
Proceedings of the Joint Conference of the 47th Annual Meeting of the ACL and the 4th International Joint Conference on Natural Language Processing of the AFNLP

1996

Technology Transfer: Observations from the TIPSTER Text Program
Sarah M. Taylor
TIPSTER TEXT PROGRAM PHASE II: Proceedings of a Workshop held at Vienna, Virginia, May 6-8, 1996

The NYU TIPSTER II Project
Sarah M. Taylor
TIPSTER TEXT PROGRAM PHASE II: Proceedings of a Workshop held at Vienna, Virginia, May 6-8, 1996