James Pustejovsky

Also published as: J. Pustejovsky, James D. Pustejovsky


2021

Proceedings of the 12th International Workshop on Health Text Mining and Information Analysis
Eben Holderness | Antonio Jimeno Yepes | Alberto Lavelli | Anne-Lyse Minard | James Pustejovsky | Fabio Rinaldi
Proceedings of the 12th International Workshop on Health Text Mining and Information Analysis

Neural Metaphor Detection with Visibility Embeddings
Gitit Kehat | James Pustejovsky
Proceedings of *SEM 2021: The Tenth Joint Conference on Lexical and Computational Semantics

We present new results for the problem of sequence metaphor labeling, using the recently developed Visibility Embeddings. We show that concatenating such embeddings to the input of a BiLSTM obtains consistent and significant improvements at almost no cost, and we present further improved results when visibility embeddings are combined with BERT.

COVID-19 Literature Knowledge Graph Construction and Drug Repurposing Report Generation
Qingyun Wang | Manling Li | Xuan Wang | Nikolaus Parulian | Guangxing Han | Jiawei Ma | Jingxuan Tu | Ying Lin | Ranran Haoran Zhang | Weili Liu | Aabhas Chauhan | Yingjun Guan | Bangzheng Li | Ruisong Li | Xiangchen Song | Yi Fung | Heng Ji | Jiawei Han | Shih-Fu Chang | James Pustejovsky | Jasmine Rah | David Liem | Ahmed ELsayed | Martha Palmer | Clare Voss | Cynthia Schneider | Boyan Onyshkevych
Proceedings of the 2021 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies: Demonstrations

To combat COVID-19, both clinicians and scientists need to digest the vast amount of relevant biomedical knowledge in the literature to understand the disease mechanism and the related biological functions. We have developed a novel and comprehensive knowledge discovery framework, COVID-KG, to extract fine-grained multimedia knowledge elements (entities, relations and events) from scientific literature. We then exploit the constructed multimedia knowledge graphs (KGs) for question answering and report generation, using drug repurposing as a case study. Our framework also provides detailed contextual sentences, subfigures, and knowledge subgraphs as evidence. All of the data, KGs, reports.

Exploration and Discovery of the COVID-19 Literature through Semantic Visualization
Jingxuan Tu | Marc Verhagen | Brent Cochran | James Pustejovsky
Proceedings of the 2021 Conference of the North American Chapter of the Association for Computational Linguistics: Student Research Workshop

We propose semantic visualization as a linguistic visual analytic method. It can enable exploration and discovery over large datasets of complex networks by exploiting the semantics of the relations in them. This involves extracting information, applying parameter reduction operations, building hierarchical data representations, and designing visualizations. We also present the accompanying COVID-SemViz, a searchable and interactive visualization system for knowledge exploration of COVID-19 data, to demonstrate the application of our proposed method. In the user studies, users found that the semantic visualization-powered COVID-SemViz is helpful for finding relevant information and discovering unknown associations.

2020

A Two-Level Interpretation of Modality in Human-Robot Dialogue
Lucia Donatelli | Kenneth Lai | James Pustejovsky
Proceedings of the 28th International Conference on Computational Linguistics

We analyze the use and interpretation of modal expressions in a corpus of situated human-robot dialogue and ask how to effectively represent these expressions for automatic learning. We present a two-level annotation scheme for modality that captures both content and intent, integrating a logic-based, semantic representation and a task-oriented, pragmatic representation that maps to our robot’s capabilities. Data from our annotation task reveals that the interpretation of modal expressions in human-robot dialogue is quite diverse, yet highly constrained by the physical environment and asymmetrical speaker/addressee relationship. We sketch a formal model of human-robot common ground in which modality can be grounded and dynamically interpreted.

Reproducing Neural Ensemble Classifier for Semantic Relation Extraction in Scientific Papers
Kyeongmin Rim | Jingxuan Tu | Kelley Lynch | James Pustejovsky
Proceedings of the 12th Language Resources and Evaluation Conference

Within the natural language processing (NLP) community, shared tasks play an important role. They define a common goal and allow the comparison of different methods on the same data. SemEval-2018 Task 7 involves the identification and classification of relations in abstracts from computational linguistics (CL) publications. In this paper we describe an attempt to reproduce the methods and results from the top performing system for SemEval-2018 Task 7. We describe challenges we encountered in the process, report on the results of our system, and discuss the ways that our attempt at reproduction can inform best practices.

A Formal Analysis of Multimodal Referring Strategies Under Common Ground
Nikhil Krishnaswamy | James Pustejovsky
Proceedings of the 12th Language Resources and Evaluation Conference

In this paper, we present an analysis of computationally generated mixed-modality definite referring expressions using combinations of gesture and linguistic descriptions. In doing so, we expose some striking formal semantic properties of the interactions between gesture and language, conditioned on the introduction of content into the common ground between the (computational) speaker and (human) viewer, and demonstrate how these formal features can contribute to training better models to predict viewer judgment of referring expressions, and potentially to the generation of more natural and informative referring expressions.

Improving Neural Metaphor Detection with Visual Datasets
Gitit Kehat | James Pustejovsky
Proceedings of the 12th Language Resources and Evaluation Conference

We present new results on Metaphor Detection by using text from visual datasets. Using a straightforward technique for sampling text from Vision-Language datasets, we create a data structure we term a visibility word embedding. We then combine these embeddings in a relatively simple BiLSTM module augmented with contextualized word representations (ELMo), and show improvement over previous state-of-the-art approaches that use more complex neural network architectures and richer linguistic features, for the task of verb classification.

Interchange Formats for Visualization: LIF and MMIF
Kyeongmin Rim | Kelley Lynch | Marc Verhagen | Nancy Ide | James Pustejovsky
Proceedings of the 12th Language Resources and Evaluation Conference

Promoting interoperable computational linguistics (CL) and natural language processing (NLP) application platforms and interchangeable data formats has contributed to improving the discoverability and accessibility of openly available NLP software. In this paper, we discuss the enhanced data visualization capabilities that are also enabled by inter-operating NLP pipelines and interchange formats. For adding openly available visualization tools and graphical annotation tools to the Language Applications Grid (LAPPS Grid) and Computational Linguistics Applications for Multimedia Services (CLAMS) toolboxes, we have developed interchange formats that can carry annotations and metadata for text and audiovisual source data. We describe those data formats and present case studies where we successfully adopt open-source visualization tools and combine them with CL tools.

Proceedings of the 11th International Workshop on Health Text Mining and Information Analysis
Eben Holderness | Antonio Jimeno Yepes | Alberto Lavelli | Anne-Lyse Minard | James Pustejovsky | Fabio Rinaldi
Proceedings of the 11th International Workshop on Health Text Mining and Information Analysis

Proceedings of the Second International Workshop on Designing Meaning Representations
Nianwen Xue | Johan Bos | William Croft | Jan Hajič | Chu-Ren Huang | Stephan Oepen | Martha Palmer | James Pustejovsky
Proceedings of the Second International Workshop on Designing Meaning Representations

A Continuation Semantics for Abstract Meaning Representation
Kenneth Lai | Lucia Donatelli | James Pustejovsky
Proceedings of the Second International Workshop on Designing Meaning Representations

Abstract Meaning Representation (AMR) is a simple, expressive semantic framework whose emphasis on predicate-argument structure is effective for many tasks. Nevertheless, AMR lacks a systematic treatment of projection phenomena, making its translation into logical form problematic. We present a translation function from AMR to first-order logic using continuation semantics, which allows us to capture the semantic context of an expression in the form of an argument. This is a natural extension of AMR’s original design principles, allowing us to easily model basic projection phenomena such as quantification and negation as well as complex phenomena such as bound variables and donkey anaphora.

Representation, Learning and Reasoning on Spatial Language for Downstream NLP Tasks
Parisa Kordjamshidi | James Pustejovsky | Marie-Francine Moens
Proceedings of the 2020 Conference on Empirical Methods in Natural Language Processing: Tutorial Abstracts

Understanding spatial semantics expressed in natural language can become highly complex in real-world applications. This includes applications of language grounding, navigation, visual question answering, and more generic human-machine interaction and dialogue systems. In many such downstream tasks, explicit representation of spatial concepts and relationships can improve the capabilities of machine learning models in reasoning and deep language understanding. In this tutorial, we overview the cutting-edge research results and existing challenges related to spatial language understanding, including semantic annotations, existing corpora, symbolic and sub-symbolic representations, qualitative spatial reasoning, spatial common sense, and deep and structured learning models. We discuss the recent results on the above-mentioned applications, which need spatial language learning and reasoning, and highlight the research gaps and future directions.

AskMe: A LAPPS Grid-based NLP Query and Retrieval System for Covid-19 Literature
Keith Suderman | Nancy Ide | Marc Verhagen | Brent Cochran | James Pustejovsky
Proceedings of the 1st Workshop on NLP for COVID-19 (Part 2) at EMNLP 2020

In a recent project, the Language Application Grid was augmented to support the mining of scientific publications. The results of that effort have now been repurposed to focus on Covid-19 literature, including modification of the LAPPS Grid “AskMe” query and retrieval engine. We describe the AskMe system and discuss its functionality as compared to other query engines available to search covid-related publications.

2019

Proceedings of the Tenth International Workshop on Health Text Mining and Information Analysis (LOUHI 2019)
Eben Holderness | Antonio Jimeno Yepes | Alberto Lavelli | Anne-Lyse Minard | James Pustejovsky | Fabio Rinaldi
Proceedings of the Tenth International Workshop on Health Text Mining and Information Analysis (LOUHI 2019)

Assessing the Efficacy of Clinical Sentiment Analysis and Topic Extraction in Psychiatric Readmission Risk Prediction
Elena Alvarez-Mellado | Eben Holderness | Nicholas Miller | Fyonn Dhang | Philip Cawkwell | Kirsten Bolton | James Pustejovsky | Mei-Hua Hall
Proceedings of the Tenth International Workshop on Health Text Mining and Information Analysis (LOUHI 2019)

Predicting which patients are more likely to be readmitted to a hospital within 30 days after discharge is a valuable piece of information in clinical decision-making. Building a successful readmission risk classifier based on the content of Electronic Health Records (EHRs) has proved, however, to be a challenging task. Previously explored features include mainly structured information, such as sociodemographic data, comorbidity codes and physiological variables. In this paper we assess incorporating additional clinically interpretable NLP-based features such as topic extraction and clinical sentiment analysis to predict early readmission risk in psychiatry patients.

Generating a Novel Dataset of Multimodal Referring Expressions
Nikhil Krishnaswamy | James Pustejovsky
Proceedings of the 13th International Conference on Computational Semantics - Short Papers

Referring expressions and definite descriptions of objects in space exploit information both about object characteristics and locations. To resolve potential ambiguity, referencing strategies in language can rely on increasingly abstract concepts to distinguish an object in a given location from similar ones elsewhere, yet the description of the intended location may still be imprecise or difficult to interpret. Meanwhile, modalities such as gesture may communicate spatial information such as locations in a more concise manner. In real peer-to-peer communication, humans use language and gesture together to reference entities, with a capacity for mixing and changing modalities where needed. While recent progress in AI and human-computer interaction has created systems where a human can interact with a computer multimodally, computers often lack the capacity to intelligently mix modalities when generating referring expressions. We present a novel dataset of referring expressions combining natural language and gesture, describe its creation and evaluation, and its uses to train computational models for generating and interpreting multimodal referring expressions.

A Dynamic Semantics for Causal Counterfactuals
Kenneth Lai | James Pustejovsky
Proceedings of the 13th International Conference on Computational Semantics - Student Papers

Under the standard approach to counterfactuals, to determine the meaning of a counterfactual sentence, we consider the “closest” possible world(s) where the antecedent is true, and evaluate the consequent. Building on the standard approach, some researchers have found that the set of worlds to be considered is dependent on context; it evolves with the discourse. Others have focused on how to define the “distance” between possible worlds, using ideas from causal modeling. This paper integrates the two ideas. We present a semantics for counterfactuals that uses a distance measure that is based on causal laws and that can also change over time. We show how our semantics can be implemented in the Haskell programming language.

Distinguishing Clinical Sentiment: The Importance of Domain Adaptation in Psychiatric Patient Health Records
Eben Holderness | Philip Cawkwell | Kirsten Bolton | James Pustejovsky | Mei-Hua Hall
Proceedings of the 2nd Clinical Natural Language Processing Workshop

Recently, natural language processing (NLP) tools have been developed to identify and extract salient risk indicators in electronic health records (EHRs). Sentiment analysis, although widely used in non-medical areas for improving decision making, has been studied minimally in the clinical setting. In this study, we undertook, to our knowledge, the first domain adaptation of sentiment analysis to psychiatric EHRs by defining psychiatric clinical sentiment, performing an annotation project, and evaluating multiple sentence-level sentiment machine learning (ML) models. Results indicate that off-the-shelf sentiment analysis tools fail to identify clinically positive or negative polarity, and that the definition of clinical sentiment that we provide is learnable with relatively small amounts of training data. This project is an initial step towards further refining sentiment analysis methods for clinical use. Our long-term objective is to incorporate the results of this project as part of a machine learning model that predicts inpatient readmission risk. We hope that this work will initiate a discussion concerning domain adaptation of sentiment analysis to the clinical setting.

Computational Linguistics Applications for Multimedia Services
Kyeongmin Rim | Kelley Lynch | James Pustejovsky
Proceedings of the 3rd Joint SIGHUM Workshop on Computational Linguistics for Cultural Heritage, Social Sciences, Humanities and Literature

We present Computational Linguistics Applications for Multimedia Services (CLAMS), a platform that provides access to computational content analysis tools for archival multimedia material that appears in different media, such as text, audio, image, and video. The primary goals of CLAMS are: (1) to develop an interchange format between multimodal metadata generation tools to ensure interoperability between tools; (2) to provide users with a portable, user-friendly workflow engine to chain selected tools to extract meaningful analyses; and (3) to create a public software development kit (SDK) for developers that eases deployment of analysis tools within the CLAMS platform. CLAMS is designed to help archives and libraries enrich the metadata associated with their mass-digitized multimedia collections, which would otherwise be largely unsearchable.

Modeling Quantification and Scope in Abstract Meaning Representations
James Pustejovsky | Ken Lai | Nianwen Xue
Proceedings of the First International Workshop on Designing Meaning Representations

In this paper, we propose an extension to Abstract Meaning Representations (AMRs) to encode scope information of quantifiers and negation, in a way that overcomes the semantic gaps of the schema while maintaining its cognitive simplicity. Specifically, we address three phenomena not previously part of the AMR specification: quantification, negation (generally), and modality. The resulting representation, which we call “Uniform Meaning Representation” (UMR), adopts the predicative core of AMR and embeds it under a “scope” graph when appropriate. UMR representations differ from other treatments of quantification and modal scope phenomena in two ways: (a) they are more transparent; and (b) they specify default scope when possible.

VerbNet Representations: Subevent Semantics for Transfer Verbs
Susan Windisch Brown | Julia Bonn | James Gung | Annie Zaenen | James Pustejovsky | Martha Palmer
Proceedings of the First International Workshop on Designing Meaning Representations

This paper announces the release of a new version of the English lexical resource VerbNet with substantially revised semantic representations designed to facilitate computer planning and reasoning based on human language. We use the transfer of possession and transfer of information event representations to illustrate both the general framework of the representations and the types of nuances the new representations can capture. These representations use a Generative Lexicon-inspired subevent structure to track attributes of event participants across time, highlighting oppositions and temporal and causal relations among the subevents.

2018

Integrating Generative Lexicon Event Structures into VerbNet
Susan Windisch Brown | James Pustejovsky | Annie Zaenen | Martha Palmer
Proceedings of the Eleventh International Conference on Language Resources and Evaluation (LREC 2018)

Bridging the LAPPS Grid and CLARIN
Erhard Hinrichs | Nancy Ide | James Pustejovsky | Jan Hajič | Marie Hinrichs | Mohammad Fazleh Elahi | Keith Suderman | Marc Verhagen | Kyeongmin Rim | Pavel Straňák | Jozef Mišutka
Proceedings of the Eleventh International Conference on Language Resources and Evaluation (LREC 2018)

Towards an ISO Standard for the Annotation of Quantification
Harry Bunt | James Pustejovsky | Kiyong Lee
Proceedings of the Eleventh International Conference on Language Resources and Evaluation (LREC 2018)

An Evaluation Framework for Multimodal Interaction
Nikhil Krishnaswamy | James Pustejovsky
Proceedings of the Eleventh International Conference on Language Resources and Evaluation (LREC 2018)

Proceedings of the First International Workshop on Spatial Language Understanding
Parisa Kordjamshidi | Archna Bhatia | James Pustejovsky | Marie-Francine Moens
Proceedings of the First International Workshop on Spatial Language Understanding

Every Object Tells a Story
James Pustejovsky | Nikhil Krishnaswamy
Proceedings of the Workshop Events and Stories in the News 2018

Most work within the computational event modeling community has tended to focus on the interpretation and ordering of events that are associated with verbs and event nominals in linguistic expressions. What is often overlooked in the construction of a global interpretation of a narrative is the role contributed by the objects participating in these structures, and the latent events and activities conventionally associated with them. Recently, the analysis of visual images has also enriched the scope of how events can be identified, by anchoring both linguistic expressions and ontological labels to segments, subregions, and properties of images. By semantically grounding event descriptions in their visualization, the importance of object-based attributes becomes more apparent. In this position paper, we look at the narrative structure of objects: that is, how objects reference events through their intrinsic attributes, such as affordances, purposes, and functions. We argue that, not only do objects encode conventionalized events, but that when they are composed within specific habitats, the ensemble can be viewed as modeling coherent event sequences, thereby enriching the global interpretation of the evolving narrative being constructed.

The Revision of ISO-Space, Focused on the Movement Link
Kiyong Lee | James Pustejovsky | Harry Bunt
Proceedings 14th Joint ACL - ISO Workshop on Interoperable Semantic Annotation

Analysis of Risk Factor Domains in Psychosis Patient Health Records
Eben Holderness | Nicholas Miller | Kirsten Bolton | Philip Cawkwell | Marie Meteer | James Pustejovsky | Mei-Hua Hall
Proceedings of the Ninth International Workshop on Health Text Mining and Information Analysis

Readmission after discharge from a hospital is disruptive and costly, regardless of the reason. However, it can be particularly problematic for psychiatric patients, so predicting which patients may be readmitted is critically important but also very difficult. Clinical narratives in psychiatric electronic health records (EHRs) span a wide range of topics and vocabulary; therefore, a psychiatric readmission prediction model must begin with a robust and interpretable topic extraction component. We created a data pipeline for using document vector similarity metrics to perform topic extraction on psychiatric EHR data in service of our long-term goal of creating a readmission risk classifier. We show initial results for our topic extraction model and identify additional features we will be incorporating in the future.

2017

Communicating and Acting: Understanding Gesture in Simulation Semantics
Nikhil Krishnaswamy | Pradyumna Narayana | Isaac Wang | Kyeongmin Rim | Rahul Bangar | Dhruva Patil | Gururaj Mulay | Ross Beveridge | Jaime Ruiz | Bruce Draper | James Pustejovsky
IWCS 2017 — 12th International Conference on Computational Semantics — Short papers

Creating Common Ground through Multimodal Simulations
James Pustejovsky | Nikhil Krishnaswamy | Bruce Draper | Pradyumna Narayana | Rahul Bangar
Proceedings of the IWCS workshop on Foundations of Situated and Multimodal Communication

Enriching the Notion of Path in ISO-Space
James Pustejovsky | Kiyong Lee
Proceedings of the 13th Joint ISO-ACL Workshop on Interoperable Semantic Annotation (ISA-13)

SemEval-2017 Task 12: Clinical TempEval
Steven Bethard | Guergana Savova | Martha Palmer | James Pustejovsky
Proceedings of the 11th International Workshop on Semantic Evaluation (SemEval-2017)

Clinical TempEval 2017 aimed to answer the question: how well do systems trained on annotated timelines for one medical condition (colon cancer) perform in predicting timelines on another medical condition (brain cancer)? Nine sub-tasks were included, covering problems in time expression identification, event expression identification and temporal relation identification. Participant systems were evaluated on clinical and pathology notes from Mayo Clinic cancer patients, annotated with an extension of TimeML for the clinical domain. 11 teams participated in the tasks, with the best systems achieving F1 scores above 0.55 for time expressions, above 0.70 for event expressions, and above 0.40 for temporal relations. Most tasks observed about a 20 point drop over Clinical TempEval 2016, where systems were trained and evaluated on the same domain (colon cancer).

Lexical Factorization and Syntactic Behavior
James Pustejovsky | Aravind Joshi
Linguistic Issues in Language Technology, Volume 15, 2017

In this paper, we examine the correlation between lexical semantics and the syntactic realization of the different components of a word’s meaning in natural language. More specifically, we will explore the effect that lexical factorization in verb semantics has on the suppression or expression of semantic features within the sentence. Factorization was a common analytic tool employed in early generative linguistic approaches to lexical decomposition, and continues to play a role in contemporary semantics, in various guises and modified forms. Building on the unpublished analysis of verbs of seeing in Joshi (1972), we argue here that the significance of lexical factorization is twofold: first, current models of verb meaning owe much of their insight to factor-based theories of meaning; secondly, the factorization properties of a lexical item appear to influence, both directly and indirectly, the possible syntactic expressibility of arguments and adjuncts in sentence composition. We argue that this information can be used to compute what we call the factor expression likelihood (FEL) associated with a verb in a sentence. This is the likelihood that the overt syntactic expression of a factor will cooccur with the verb. This has consequences for the compositional mechanisms responsible for computing the meaning of the sentence, as well as significance in the creation of computational models attempting to capture linguistic behavior over large corpora.

Integrating Vision and Language Datasets to Measure Word Concreteness
Gitit Kehat | James Pustejovsky
Proceedings of the Eighth International Joint Conference on Natural Language Processing (Volume 2: Short Papers)

We present and take advantage of the inherent visualizability properties of words in visual corpora (the textual components of vision-language datasets) to compute concreteness scores for words. Our simple method does not require hand-annotated concreteness score lists for training, and yields state-of-the-art results when evaluated against concreteness score lists and previously derived scores, as well as when used for metaphor detection.

Building Multimodal Simulations for Natural Language
James Pustejovsky | Nikhil Krishnaswamy
Proceedings of the 15th Conference of the European Chapter of the Association for Computational Linguistics: Tutorial Abstracts

In this tutorial, we introduce a computational framework and modeling language (VoxML) for composing multimodal simulations of natural language expressions within a 3D simulation environment (VoxSim). We demonstrate how to construct voxemes, which are visual object representations of linguistic entities. We also show how to compose events and actions over these objects, within a restricted domain of dynamics. This gives us the building blocks to simulate narratives of multiple events or participate in a multimodal dialogue with synthetic agents in the simulation environment. To our knowledge, this is the first time such material has been presented as a tutorial within the CL community.This will be of relevance to students and researchers interested in modeling actionable language, natural language communication with agents and robots, spatial and temporal constraint solving through language, referring expression generation, embodied cognition, as well as minimal model creation.Multimodal simulation of language, particularly motion expressions, brings together a number of existing lines of research from the computational linguistic, semantics, robotics, and formal logic communities, including action and event representation (Di Eugenio, 1991), modeling gestural correlates to NL expressions (Kipp et al., 2007; Neff et al., 2008), and action event modeling (Kipper and Palmer, 2000; Yang et al., 2015). We combine an approach to event modeling with a scene generation approach akin to those found in work by (Coyne and Sproat, 2001; Siskind, 2011; Chang et al., 2015). Mapping natural language expressions through a formal model and a dynamic logic interpretation into a visualization of the event described provides an environment for grounding concepts and referring expressions that is interpretable by both a computer and a human user. 
This opens a variety of avenues for humans to communicate with computerized agents and robots, as in (Matuszek et al., 2013; Lauria et al., 2001), (Forbes et al., 2015), and (Deits et al., 2013; Walter et al., 2013; Tellex et al., 2014). Simulation and automatic visualization of events from natural language descriptions and supplementary modalities, such as gestures, allows humans to use their native capabilities as linguistic and visual interpreters to collaborate on tasks with an artificial agent or to put semantic intuitions to the test in an environment where user and agent share a common context.In previous work (Pustejovsky and Krishnaswamy, 2014; Pustejovsky, 2013a), we introduced a method for modeling natural language expressions within a 3D simulation environment built on top of the game development platform Unity (Goldstone, 2009). The goal of that work was to evaluate, through explicit visualizations of linguistic input, the semantic presuppositions inherent in the different lexical choices of an utterance. This work led to two additional lines of research: an explicit encoding for how an object is itself situated relative to its environment; and an operational characterization of how an object changes its location or how an agent acts on an object over time, e.g., its affordance structure. The former has developed into a semantic notion of situational context, called a habitat (Pustejovsky, 2013a; McDonald and Pustejovsky, 2014), while the latter is addressed by dynamic interpretations of event structure (Pustejovsky and Moszkowicz, 2011; Pustejovsky and Krishnaswamy, 2016b; Pustejovsky, 2013b).The requirements on building a visual simulation from language include several components. We require a rich type system for lexical items and their composition, as well as a language for modeling the dynamics of events, based on Generative Lexicon (GL). Further, a minimal embedding space (MES) for the simulation must be determined. 
This is the 3D region within which the state is configured or the event unfolds. Object-based attributes for participants in a situation or event also need to be specified; e.g., orientation, relative size, default position or pose, etc. The simulation establishes an epistemic condition on the object and event rendering, imposing an implicit point of view (POV). Finally, there must be some sort of agent-dependent embodiment; this determines the relative scaling of an agent and its event participants and their surroundings, as it engages in the environment.In order to construct a robust simulation from linguistic input, an event and its participants must be embedded within an appropriate minimal embedding space. This must sufficiently enclose the event localization, while optionally including space enough for a frame of reference for the event (the viewer’s perspective).We first describe the formal multimodal foundations for the modeling language, VoxML, which creates a minimal simulation from the linguistic input interpreted by the multimodal language, DITL. We then describe VoxSim, the compositional modeling and simulation environment, which maps the minimal VoxML model of the linguistic utterance to a simulation in Unity. This knowledge includes specification of object affordances, e.g., what actions are possible or enabled by use an object.VoxML (Pustejovsky and Krishnaswamy, 2016b; Pustejovsky and Krishnaswamy, 2016a) encodes semantic knowledge of real-world objects represented as 3D models, and of events and attributes related to and enacted over these objects. VoxML goes beyond the limitations of existing 3D visual markup languages by allowing for the encoding of a broad range of semantic knowledge that can be exploited by a simulation platform such as VoxSim.VoxSim (Krishnaswamy and Pustejovsky, 2016a; Krishnaswamy and Pustejovsky, 2016b) uses object and event semantic knowledge to generate animated scenes in real time without a complex animation interface. 
It uses the Unity game engine for graphics and I/O processing and takes as input a simple natural language utterance. The parsed utterance is semantically interpreted and transformed into a hybrid dynamic logic representation (DITL), and used to generate a minimal simulation of the event when composed with VoxML knowledge. 3D assets and VoxML-modeled nominal objects and events are created with other Unity-based tools, and VoxSim uses the entirety of the composed information to render a visualization of the described event. The tutorial participants will learn how to build simulatable objects, compose dynamic event structures, and simulate the events running over the objects. The toolkit consists of object and program (event) composers and the runtime environment, which allows for the user to directly manipulate the objects, or interact with synthetic agents in VoxSim. As a result of this tutorial, the student will acquire the following skill set: take a novel object geometry from a library and model it in VoxML; apply existing library behaviors (actions or events) to the new VoxML object; model attributes of new objects as well as introduce novel attributes; model novel behaviors over objects. The tutorial modules will be conducted within a build image of the software. Access to libraries will be provided by the instructors. No knowledge of 3D modeling or the Unity platform will be required.

2016

pdf bib
The Language Application Grid and Galaxy
Nancy Ide | Keith Suderman | James Pustejovsky | Marc Verhagen | Christopher Cieri
Proceedings of the Tenth International Conference on Language Resources and Evaluation (LREC'16)

The NSF-SI2-funded LAPPS Grid project is a collaborative effort among Brandeis University, Vassar College, Carnegie-Mellon University (CMU), and the Linguistic Data Consortium (LDC), which has developed an open, web-based infrastructure through which resources can be easily accessed and within which tailored language services can be efficiently composed, evaluated, disseminated and consumed by researchers, developers, and students across a wide variety of disciplines. The LAPPS Grid project recently adopted Galaxy (Giardine et al., 2005), a robust, well-developed, and well-supported front end for workflow configuration, management, and persistence. Galaxy allows data inputs and processing steps to be selected from graphical menus, and results are displayed in intuitive plots and summaries that encourage interactive workflows and the exploration of hypotheses. The Galaxy workflow engine provides significant advantages for deploying pipelines of LAPPS Grid web services, including not only means to create and deploy locally-run and even customized versions of the LAPPS Grid as well as running the LAPPS Grid in the cloud, but also access to a huge array of statistical and visualization tools that have been developed for use in genomics research.

pdf bib
VoxML: A Visualization Modeling Language
James Pustejovsky | Nikhil Krishnaswamy
Proceedings of the Tenth International Conference on Language Resources and Evaluation (LREC'16)

We present the specification for a modeling language, VoxML, which encodes semantic knowledge of real-world objects represented as three-dimensional models, and of events and attributes related to and enacted over these objects. VoxML is intended to overcome the limitations of existing 3D visual markup languages by allowing for the encoding of a broad range of semantic knowledge that can be exploited by a variety of systems and platforms, leading to multimodal simulations of real-world scenarios using conceptual objects that represent their semantic values.

pdf bib
The Development of Multimodal Lexical Resources
James Pustejovsky | Tuan Do | Gitit Kehat | Nikhil Krishnaswamy
Proceedings of the Workshop on Grammar and Lexicon: interactions and interfaces (GramLex)

Human communication is a multimodal activity, involving not only speech and written expressions, but intonation, images, gestures, visual clues, and the interpretation of actions through perception. In this paper, we describe the design of a multimodal lexicon that is able to accommodate the diverse modalities that present themselves in NLP applications. We have been developing a multimodal semantic representation, VoxML, that integrates the encoding of semantic, visual, gestural, and action-based features associated with linguistic expressions.

pdf bib
Proceedings of the Third International Workshop on Worldwide Language Service Infrastructure and Second Workshop on Open Infrastructures and Analysis Frameworks for Human Language Technologies (WLSI/OIAF4HLT2016)
Yohei Murakami | Donghui Lin | Nancy Ide | James Pustejovsky
Proceedings of the Third International Workshop on Worldwide Language Service Infrastructure and Second Workshop on Open Infrastructures and Analysis Frameworks for Human Language Technologies (WLSI/OIAF4HLT2016)

pdf bib
LAPPS/Galaxy: Current State and Next Steps
Nancy Ide | Keith Suderman | Eric Nyberg | James Pustejovsky | Marc Verhagen
Proceedings of the Third International Workshop on Worldwide Language Service Infrastructure and Second Workshop on Open Infrastructures and Analysis Frameworks for Human Language Technologies (WLSI/OIAF4HLT2016)

The US National Science Foundation (NSF) SI2-funded LAPPS/Galaxy project has developed an open-source platform for enabling complex analyses while hiding complexities associated with underlying infrastructure, that can be accessed through a web interface, deployed on any Unix system, or run from the cloud. It provides sophisticated tool integration and history capabilities, a workflow system for building automated multi-step analyses, state-of-the-art evaluation capabilities, and facilities for sharing and publishing analyses. This paper describes the current facilities available in LAPPS/Galaxy and outlines the project’s ongoing activities to enhance the framework.

pdf bib
VoxSim: A Visual Platform for Modeling Motion Language
Nikhil Krishnaswamy | James Pustejovsky
Proceedings of COLING 2016, the 26th International Conference on Computational Linguistics: System Demonstrations

Much existing work in text-to-scene generation focuses on generating static scenes. By introducing a focus on motion verbs, we integrate dynamic semantics into a rich formal model of events to generate animations in real time that correlate with human conceptions of the event described. This paper presents a working system that generates these animated scenes over a test set, discussing challenges encountered and describing the solutions implemented.

pdf bib
SemEval-2016 Task 12: Clinical TempEval
Steven Bethard | Guergana Savova | Wei-Te Chen | Leon Derczynski | James Pustejovsky | Marc Verhagen
Proceedings of the 10th International Workshop on Semantic Evaluation (SemEval-2016)

2015

pdf bib
The Semantics of Image Annotation
Julia Bosque-Gil | James Pustejovsky
Proceedings of the 11th Joint ACL-ISO Workshop on Interoperable Semantic Annotation (ISA-11)

pdf bib
SemEval-2015 Task 5: QA TempEval - Evaluating Temporal Information Understanding with Question Answering
Hector Llorens | Nathanael Chambers | Naushad UzZaman | Nasrin Mostafazadeh | James Allen | James Pustejovsky
Proceedings of the 9th International Workshop on Semantic Evaluation (SemEval 2015)

pdf bib
SemEval-2015 Task 6: Clinical TempEval
Steven Bethard | Leon Derczynski | Guergana Savova | James Pustejovsky | Marc Verhagen
Proceedings of the 9th International Workshop on Semantic Evaluation (SemEval 2015)

pdf bib
SemEval-2015 Task 8: SpaceEval
James Pustejovsky | Parisa Kordjamshidi | Marie-Francine Moens | Aaron Levine | Seth Dworman | Zachary Yocum
Proceedings of the 9th International Workshop on Semantic Evaluation (SemEval 2015)

2014

pdf bib
Temporal Annotation in the Clinical Domain
William F. Styler IV | Steven Bethard | Sean Finan | Martha Palmer | Sameer Pradhan | Piet C de Groen | Brad Erickson | Timothy Miller | Chen Lin | Guergana Savova | James Pustejovsky
Transactions of the Association for Computational Linguistics, Volume 2

This article discusses the requirements of a formal specification for the annotation of temporal information in clinical narratives. We discuss the implementation and extension of ISO-TimeML for annotating a corpus of clinical notes, known as the THYME corpus. To reflect the information task and the heavily inference-based reasoning demands in the domain, a new annotation guideline has been developed, “the THYME Guidelines to ISO-TimeML (THYME-TimeML)”. To clarify what relations merit annotation, we distinguish between linguistically-derived and inferentially-derived temporal orderings in the text. We also apply a top performing TempEval 2013 system against this new resource to measure the difficulty of adapting systems to the clinical domain. The corpus is available to the community and has been proposed for use in a SemEval 2015 task.

pdf bib
Proceedings of the EACL 2014 Workshop on Computational Approaches to Causality in Language (CAtoCL)
Oleksandr Kolomiyets | Marie-Francine Moens | Martha Palmer | James Pustejovsky | Steven Bethard
Proceedings of the EACL 2014 Workshop on Computational Approaches to Causality in Language (CAtoCL)

pdf bib
The Language Application Grid Web Service Exchange Vocabulary
Nancy Ide | James Pustejovsky | Keith Suderman | Marc Verhagen
Proceedings of the Workshop on Open Infrastructures and Analysis Frameworks for HLT

pdf bib
A Conceptual Framework of Online Natural Language Processing Pipeline Application
Chunqi Shi | James Pustejovsky | Marc Verhagen
Proceedings of the Workshop on Open Infrastructures and Analysis Frameworks for HLT

pdf bib
Extracting Aspects and Polarity from Patents
Peter Anick | Marc Verhagen | James Pustejovsky
Proceedings of the COLING Workshop on Synchronic and Diachronic Approaches to Analyzing Technical Language

pdf bib
Generating Simulations of Motion Events from Verbal Descriptions
James Pustejovsky | Nikhil Krishnaswamy
Proceedings of the Third Joint Conference on Lexical and Computational Semantics (*SEM 2014)

pdf bib
Image Annotation with ISO-Space: Distinguishing Content from Structure
James Pustejovsky | Zachary Yocum
Proceedings of the Ninth International Conference on Language Resources and Evaluation (LREC'14)

Natural language descriptions of visual media present interesting problems for linguistic annotation of spatial information. This paper explores the use of ISO-Space, an annotation specification for capturing spatial information, for encoding spatial relations mentioned in descriptions of images. In particular, we focus on the distinction between references to representational content and structural components of images, and the utility of such a distinction within a compositional semantics. We also discuss how such a structure-content distinction within the linguistic annotation can be leveraged to compute further inferences about spatial configurations depicted by images with verbal captions. We construct a composition table to relate content-based relations to structure-based relations in the image, as expressed in the captions. While still preliminary, our initial results suggest that a weak composition table is both sound and informative for deriving new spatial relations.

pdf bib
Identification of Technology Terms in Patents
Peter Anick | Marc Verhagen | James Pustejovsky
Proceedings of the Ninth International Conference on Language Resources and Evaluation (LREC'14)

Natural language analysis of patents holds promise for the development of tools designed to assist analysts in the monitoring of emerging technologies. One component of such tools is the identification of technology terms. We describe an approach to the discovery of technology terms using supervised machine learning and evaluate its performance on subsets of patents in three languages: English, German, and Chinese.

pdf bib
The Language Application Grid
Nancy Ide | James Pustejovsky | Christopher Cieri | Eric Nyberg | Di Wang | Keith Suderman | Marc Verhagen | Jonathan Wright
Proceedings of the Ninth International Conference on Language Resources and Evaluation (LREC'14)

The Language Application (LAPPS) Grid project is establishing a framework that enables language service discovery, composition, and reuse and promotes sustainability, manageability, usability, and interoperability of natural language processing (NLP) components. It is based on the service-oriented architecture (SOA), a more recent, web-oriented version of the “pipeline” architecture that has long been used in NLP for sequencing loosely-coupled linguistic analyses. The LAPPS Grid provides access to basic NLP processing tools and resources and enables pipelining such tools to create custom NLP applications, as well as composite services such as question answering and machine translation together with language resources such as mono- and multi-lingual corpora and lexicons that support NLP. The transformative aspect of the LAPPS Grid is that it orchestrates access to and deployment of language resources and processing functions available from servers around the globe and enables users to add their own language resources, services, and even service grids to satisfy their particular needs.

2013

pdf bib
Capturing Motion in ISO-SpaceBank
James Pustejovsky | Zachary Yocum
Proceedings of the 9th Joint ISO - ACL SIGSEM Workshop on Interoperable Semantic Annotation

pdf bib
Inference Patterns with Intensional Adjectives
James Pustejovsky
Proceedings of the 9th Joint ISO - ACL SIGSEM Workshop on Interoperable Semantic Annotation

pdf bib
Where Things Happen: On the Semantics of Event Localization
James Pustejovsky
Proceedings of the IWCS 2013 Workshop on Computational Models of Spatial Language Interpretation and Generation (CoSLI-3)

pdf bib
Proceedings of the 6th International Conference on Generative Approaches to the Lexicon (GL2013)
James Pustejovsky
Proceedings of the 6th International Conference on Generative Approaches to the Lexicon (GL2013)

pdf bib
Dynamic Event Structure and Habitat Theory
James Pustejovsky
Proceedings of the 6th International Conference on Generative Approaches to the Lexicon (GL2013)

pdf bib
Informativeness Constraints and Compositionality
Olga Batiukova | James Pustejovsky
Proceedings of the 6th International Conference on Generative Approaches to the Lexicon (GL2013)

pdf bib
SemEval-2013 Task 1: TempEval-3: Evaluating Time Expressions, Events, and Temporal Relations
Naushad UzZaman | Hector Llorens | Leon Derczynski | James Allen | Marc Verhagen | James Pustejovsky
Second Joint Conference on Lexical and Computational Semantics (*SEM), Volume 2: Proceedings of the Seventh International Workshop on Semantic Evaluation (SemEval 2013)

2012

pdf bib
The Role of Linguistic Models and Language Annotation in Feature Selection for Machine Learning
James Pustejovsky
Proceedings of the Sixth Linguistic Annotation Workshop

pdf bib
The TARSQI Toolkit
Marc Verhagen | James Pustejovsky
Proceedings of the Eighth International Conference on Language Resources and Evaluation (LREC'12)

We present and demonstrate the updated version of the TARSQI Toolkit, a suite of temporal processing modules that extract temporal information from natural language texts. It parses the document and identifies temporal expressions, recognizes events, anchors events to temporal expressions and orders events relative to each other. The toolkit was previously demonstrated at COLING 2008, but has since seen substantial changes including: (1) incorporation of a new time expression tagger, (2) embracement of stand-off annotation, (3) application to the medical domain and (4) introduction of narrative containers.

pdf bib
Word Sense Inventories by Non-Experts.
Anna Rumshisky | Nick Botchan | Sophie Kushkuley | James Pustejovsky
Proceedings of the Eighth International Conference on Language Resources and Evaluation (LREC'12)

In this paper, we explore different strategies for implementing a crowdsourcing methodology for a single-step construction of an empirically-derived sense inventory and the corresponding sense-annotated corpus. We report on the crowdsourcing experiments using implementation strategies with different HIT costs, worker qualification testing, and other restrictions. We describe multiple adjustments required to ensure successful HIT design, given significant changes within the crowdsourcing community over the last three years.

pdf bib
ATLIS: Identifying Locational Information in Text Automatically
John Vogel | Marc Verhagen | James Pustejovsky
Proceedings of the Eighth International Conference on Language Resources and Evaluation (LREC'12)

ATLIS (short for “ATLIS Tags Locations in Strings”) is a tool being developed using a maximum-entropy machine learning model for automatically identifying information relating to spatial and locational information in natural language text. It is being developed in parallel with the ISO-Space standard for annotation of spatial information (Pustejovsky, Moszkowicz & Verhagen 2011). The goal of ATLIS is to be able to take in a document as raw text and mark it up with ISO-Space annotation data, so that another program could use the information in a standardized format to reason about the semantics of the spatial information in the document. The tool (as well as ISO-Space itself) is still in the early stages of development. At present it implements a subset of the proposed ISO-Space annotation standard: it identifies expressions that refer to specific places, as well as identifying prepositional constructions that indicate a spatial relationship between two objects. In this paper, the structure of the ATLIS tool is presented, along with preliminary evaluations of its performance.

pdf bib
The Role of Model Testing in Standards Development: The Case of ISO-Space
James Pustejovsky | Jessica Moszkowicz
Proceedings of the Eighth International Conference on Language Resources and Evaluation (LREC'12)

In this paper, we describe the methodology being used to develop certain aspects of ISO-Space, an annotation language for encoding spatial and spatiotemporal information as expressed in natural language text. After reviewing the requirements of a specification for capturing such knowledge from linguistic descriptions, we describe how ISO-Space has developed to meet the needs of the specification. ISO-Space is an emerging resource that is being developed in the context of an iterative effort to test the specification model with annotation, a methodology called MAMA (Model-Annotate-Model-Annotate) (Pustejovsky and Stubbs, 2012). We describe the genres of text that are being used in a pilot annotation study, in order to both refine and enrich the specification language by way of crowd sourcing simple annotation tasks with Amazon's Mechanical Turk Service.

pdf bib
Are You Sure That This Happened? Assessing the Factuality Degree of Events in Text
Roser Saurí | James Pustejovsky
Computational Linguistics, Volume 38, Issue 2 - June 2012

pdf bib
Qualitative Modeling of Spatial Prepositions and Motion Expressions
Inderjeet Mani | James Pustejovsky
Proceedings of the 50th Annual Meeting of the Association for Computational Linguistics: Tutorial Abstracts

2011

pdf bib
Medstract - The Next Generation
Marc Verhagen | James Pustejovsky
Proceedings of BioNLP 2011 Workshop

pdf bib
Increasing Informativeness in Temporal Annotation
James Pustejovsky | Amber Stubbs
Proceedings of the 5th Linguistic Annotation Workshop

2010

pdf bib
SemEval-2010 Task 7: Argument Selection and Coercion
James Pustejovsky | Anna Rumshisky | Alex Plotnick | Elisabetta Jezek | Olga Batiukova | Valeria Quochi
Proceedings of the 5th International Workshop on Semantic Evaluation

pdf bib
SemEval-2010 Task 13: TempEval-2
Marc Verhagen | Roser Saurí | Tommaso Caselli | James Pustejovsky
Proceedings of the 5th International Workshop on Semantic Evaluation

pdf bib
ISO-TimeML: An International Standard for Semantic Annotation
James Pustejovsky | Kiyong Lee | Harry Bunt | Laurent Romary
Proceedings of the Seventh International Conference on Language Resources and Evaluation (LREC'10)

In this paper, we present ISO-TimeML, a revised and interoperable version of the temporal markup language, TimeML. We describe the changes and enrichments made, while framing the effort in a more general methodology of semantic annotation. In particular, we assume a principled distinction between the annotation of an expression and the representation which that annotation denotes. This involves not only the specification of an annotation language for a particular phenomenon, but also the development of a meta-model that allows one to interpret the syntactic expressions of the specification semantically.

pdf bib
A Road Map for Interoperable Language Resource Metadata
Christopher Cieri | Khalid Choukri | Nicoletta Calzolari | D. Terence Langendoen | Johannes Leveling | Martha Palmer | Nancy Ide | James Pustejovsky
Proceedings of the Seventh International Conference on Language Resources and Evaluation (LREC'10)

LRs remain expensive to create and thus rare relative to demand across languages and technology types. The accidental re-creation of an LR that already exists is a nearly unforgivable waste of scarce resources that is unfortunately not so easy to avoid. The number of catalogs the HLT researcher must search, with their different formats, makes it possible to overlook an existing resource. This paper sketches the sources of this problem and outlines a proposal to rectify it, along with a new vision of LR cataloging that will facilitate the documentation and exploitation of a much wider range of LRs than previously considered.

2009

pdf bib
SemEval-2010 Task 7: Argument Selection and Coercion
James Pustejovsky | Anna Rumshisky
Proceedings of the Workshop on Semantic Evaluations: Recent Achievements and Future Directions (SEW-2009)

pdf bib
SemEval-2010 Task 13: Evaluating Events, Time Expressions, and Temporal Relations (TempEval-2)
James Pustejovsky | Marc Verhagen
Proceedings of the Workshop on Semantic Evaluations: Recent Achievements and Future Directions (SEW-2009)

pdf bib
The SILT and FlaReNet International Collaboration for Interoperability
Nancy Ide | James Pustejovsky | Nicoletta Calzolari | Claudia Soria
Proceedings of the Third Linguistic Annotation Workshop (LAW III)

pdf bib
GLML: Annotating Argument Selection and Coercion
James Pustejovsky | Jessica Moszkowicz | Olga Batiukova | Anna Rumshisky
Proceedings of the Eight International Conference on Computational Semantics

2008

pdf bib
Integrating Motion Predicate Classes with Spatial and Temporal Annotations
James Pustejovsky | Jessica L. Moszkowicz
Coling 2008: Companion volume: Posters

pdf bib
Temporal Processing with the TARSQI Toolkit
Marc Verhagen | James Pustejovsky
Coling 2008: Companion volume: Demonstrations

2007

pdf bib
Combining Independent Syntactic and Semantic Annotation Schemes
Marc Verhagen | Amber Stubbs | James Pustejovsky
Proceedings of the Linguistic Annotation Workshop

pdf bib
SemEval-2007 Task 15: TempEval Temporal Relation Identification
Marc Verhagen | Robert Gaizauskas | Frank Schilder | Mark Hepple | Graham Katz | James Pustejovsky
Proceedings of the Fourth International Workshop on Semantic Evaluations (SemEval-2007)

pdf bib
Automatically Identifying the Arguments of Discourse Connectives
Ben Wellner | James Pustejovsky
Proceedings of the 2007 Joint Conference on Empirical Methods in Natural Language Processing and Computational Natural Language Learning (EMNLP-CoNLL)

2006

pdf bib
Machine Learning of Temporal Relations
Inderjeet Mani | Marc Verhagen | Ben Wellner | Chong Min Lee | James Pustejovsky
Proceedings of the 21st International Conference on Computational Linguistics and 44th Annual Meeting of the Association for Computational Linguistics

pdf bib
Proceedings of the Workshop on Annotating and Reasoning about Time and Events
Branimir Boguraev | Rafael Muñoz | James Pustejovsky
Proceedings of the Workshop on Annotating and Reasoning about Time and Events

pdf bib
Classification of Discourse Coherence Relations: An Exploratory Study using Multiple Knowledge Sources
Ben Wellner | James Pustejovsky | Catherine Havasi | Anna Rumshisky | Roser Saurí
Proceedings of the 7th SIGdial Workshop on Discourse and Dialogue

pdf bib
BULB: A Unified Lexical Browser
Catherine Havasi | James Pustejovsky | Marc Verhagen
Proceedings of the Fifth International Conference on Language Resources and Evaluation (LREC’06)

Natural language processing researchers currently have access to a wealth of information about words and word senses. This presents problems as well as resources, as it is often difficult to search through and coordinate lexical information across various data sources. We have approached this problem by creating a shared environment for various lexical resources. This browser, BULB (Brandeis Unified Lexical Browser), and its accompanying front-end provide the NLP researcher with a coordinated display from many of the available lexical resources, focusing, in particular, on a newly developed lexical database, the Brandeis Semantic Ontology (BSO). BULB is a module-based browser focusing on the interaction and display of modules from existing NLP tools. We discuss the BSO, PropBank, FrameNet, WordNet, and CQP, as well as other modules which will extend the system. We then outline future extensions to this work and present a release schedule for BULB.

pdf bib
Towards a Generative Lexical Resource: The Brandeis Semantic Ontology
James Pustejovsky | Catherine Havasi | Jessica Littman | Anna Rumshisky | Marc Verhagen
Proceedings of the Fifth International Conference on Language Resources and Evaluation (LREC’06)

In this paper we describe the structure and development of the Brandeis Semantic Ontology (BSO), a large generative lexicon ontology and lexical database. The BSO has been designed to allow for more widespread access to Generative Lexicon-based lexical resources and help researchers in a variety of computational tasks. The specification of the type system used in the BSO largely follows that proposed by the SIMPLE specification (Busa et al., 2001), which was adopted by the EU-sponsored SIMPLE project (Lenci et al., 2000).

pdf bib
Annotation of Temporal Relations with Tango
Marc Verhagen | Robert Knippen | Inderjeet Mani | James Pustejovsky
Proceedings of the Fifth International Conference on Language Resources and Evaluation (LREC’06)

Temporal annotation is a complex task characterized by low markup speed and low inter-annotator agreement scores. Tango is a graphical annotation tool for temporal relations. It is developed for the TimeML annotation language and allows annotators to build a graph that resembles a timeline. Temporal relations are added by selecting events and drawing labeled arrows between them. Tango is integrated with a temporal closure component and includes features like SmartLink, user prompting and automatic linking of time expressions. Tango has been used to create two corpora with temporal annotation, TimeBank and the AQUAINT Opinion corpus.

pdf bib
Inducing Sense-Discriminating Context Patterns from Sense-Tagged Corpora
Anna Rumshisky | James Pustejovsky
Proceedings of the Fifth International Conference on Language Resources and Evaluation (LREC’06)

Traditionally, context features used in word sense disambiguation are based on collocation statistics and use only minimal syntactic and semantic information. Corpus Pattern Analysis is a technique for producing knowledge-rich context features that capture sense distinctions. It involves (1) identifying sense-carrying context patterns and (2) using the derived context features to discriminate between the unseen instances. Both stages require manual seeding. In this paper, we show how to automate inducing sense-discriminating context features from a sense-tagged corpus.

pdf bib
SlinkET: A Partial Modal Parser for Events
Roser Saurí | Marc Verhagen | James Pustejovsky
Proceedings of the Fifth International Conference on Language Resources and Evaluation (LREC’06)

We present SlinkET, a parser for identifying contexts of event modality in text developed within the TARSQI (Temporal Awareness and Reasoning Systems for Question Interpretation) research framework. SlinkET is grounded on TimeML, a specification language for capturing temporal and event related information in discourse, which provides an adequate foundation to handle event modality. SlinkET builds on top of a robust event recognizer, and provides each relevant event with a value that specifies the degree of certainty about its factuality; e.g., whether it has happened or holds (factive or counter-factive), whether it is being reported or witnessed by somebody else (evidential), or if it is introduced as a possibility (modal). It is based on well-established technology in the field (namely, finite-state techniques), and informed with corpus-induced knowledge that relies on basic information, such as morphological features, POS, and chunking. SlinkET is under continuing development and it currently achieves a performance ratio of 70% F1-measure.

2005

pdf bib
Evita: A Robust Event Recognizer For QA Systems
Roser Saurí | Robert Knippen | Marc Verhagen | James Pustejovsky
Proceedings of Human Language Technology Conference and Conference on Empirical Methods in Natural Language Processing

pdf bib
Merging PropBank, NomBank, TimeBank, Penn Discourse Treebank and Coreference
James Pustejovsky | Adam Meyers | Martha Palmer | Massimo Poesio
Proceedings of the Workshop on Frontiers in Corpus Annotations II: Pie in the Sky

pdf bib
Adaptive String Similarity Metrics for Biomedical Reference Resolution
Ben Wellner | José Castaño | James Pustejovsky
Proceedings of the ACL-ISMB Workshop on Linking Biological Literature, Ontologies and Databases: Mining Biological Semantics

pdf bib
Automating Temporal Annotation with TARSQI
Marc Verhagen | Inderjeet Mani | Roser Sauri | Jessica Littman | Robert Knippen | Seok B. Jang | Anna Rumshisky | John Phillips | James Pustejovsky
Proceedings of the ACL Interactive Poster and Demonstration Sessions

2004

pdf bib
Temporal Discourse Models for Narrative Structure
Inderjeet Mani | James Pustejovsky
Proceedings of the Workshop on Discourse Annotation

pdf bib
Automated Induction of Sense in Context
James Pustejovsky | Patrick Hanks | Anna Rumshisky
Proceedings of the 5th International Workshop on Linguistically Interpreted Corpora

pdf bib
Automated Induction of Sense in Context
James Pustejovsky | Patrick Hanks | Anna Rumshisky
COLING 2004: Proceedings of the 20th International Conference on Computational Linguistics

2003

pdf bib
Annotation of Temporal and Event Expressions
James Pustejovsky | Inderjeet Mani
Companion Volume of the Proceedings of HLT-NAACL 2003 - Tutorial Abstracts

2002

pdf bib
Creating Domain-specific Information Servers
James Pustejovsky
Proceedings of the Third International Conference on Language Resources and Evaluation (LREC’02)

pdf bib
Medstract: creating large-scale information servers from biomedical texts
James Pustejovsky | José Castaño | Roser Saurí | Jason Zhang | Wei Luo
Proceedings of the ACL-02 Workshop on Natural Language Processing in the Biomedical Domain

1994

pdf bib
On the Proper Role of Coercion in Semantic Typing
James Pustejovsky | Pierrette Bouillon
COLING 1994 Volume 2: The 15th International Conference on Computational Linguistics

pdf bib
Diderot: TIPSTER Program, Automatic Data Extraction from Text Utilizing Semantic Analysis
Y. Wilks | J. Pustejovsky | J. Cowie
Human Language Technology: Proceedings of a Workshop held at Plainsboro, New Jersey, March 8-11, 1994

1993

pdf bib
Lexical Semantic Techniques for Corpus Analysis
James Pustejovsky | Sabine Bergler | Peter Anick
Computational Linguistics, Volume 19, Number 2, June 1993, Special Issue on Using Large Corpora: II

pdf bib
Diderot: TIPSTER Program, Automatic Data Extraction from Text Utilizing Semantic Analysis
Y. Wilks | J. Pustejovsky | J. Cowie
Human Language Technology: Proceedings of a Workshop Held at Plainsboro, New Jersey, March 21-24, 1993

pdf bib
CRL/Brandeis: The Diderot System
Jim Cowie | Louise Guthrie | Jin Wang | William Ogden | James Pustejovsky | Rong Wang | Takahiro Wakao | Scott Waterman | Yorick Wilks
TIPSTER TEXT PROGRAM: PHASE I: Proceedings of a Workshop held at Fredericksburg, Virginia, September 19-23, 1993

pdf bib
CRL/Brandeis: Description of the Diderot System as Used for MUC-5
Jim Cowie | Louise Guthrie | Jin Wang | Rong Wang | Takahiro Wakao | James Pustejovsky | Scott Waterman
Fifth Message Understanding Conference (MUC-5): Proceedings of a Conference Held in Baltimore, Maryland, August 25-27, 1993

pdf bib
Summary of Workshop on Lexicons for Text Extraction
James Pustejovsky
Fifth Message Understanding Conference (MUC-5): Proceedings of a Conference Held in Baltimore, Maryland, August 25-27, 1993

1992

pdf bib
CRL/NMSU and Brandeis MucBruce: MUC-4 Test Results and Analysis
Jim Cowie | Louise Guthrie | Yorick Wilks | James Pustejovsky
Fourth Message Understanding Conference (MUC-4): Proceedings of a Conference Held in McLean, Virginia, June 16-18, 1992

pdf bib
The Acquisition of Lexical Semantic Knowledge from Large Corpora
James Pustejovsky
Speech and Natural Language: Proceedings of a Workshop Held at Harriman, New York, February 23-26, 1992

pdf bib
Diderot: TIPSTER Program, Automatic Data Extraction from Text Utilizing Semantic Analysis
Y. Wilks | J. Pustejovsky | J. Cowie
Speech and Natural Language: Proceedings of a Workshop Held at Harriman, New York, February 23-26, 1992

1991

pdf bib
The Generative Lexicon
James Pustejovsky
Computational Linguistics, Volume 17, Number 4, December 1991

1990

pdf bib
An Application of Lexical Semantics to Knowledge Acquisition from Corpora
Peter Anick | James Pustejovsky
COLING 1990 Volume 2: Papers presented to the 13th International Conference on Computational Linguistics

pdf bib
Lexical Ambiguity and The Role of Knowledge Representation in Lexicon Design
Branimir Boguraev | James Pustejovsky
COLING 1990 Volume 2: Papers presented to the 13th International Conference on Computational Linguistics

1989

pdf bib
Language and Spatial Cognition
James Pustejovsky
Computational Linguistics, Volume 15, Number 3, September 1989

1988

pdf bib
On The Semantic Interpretation of Nominals
James Pustejovsky | Peter G. Anick
Coling Budapest 1988 Volume 2: International Conference on Computational Linguistics

1987

pdf bib
On the Acquisition of Lexical Entries: The Perceptual Origin of Thematic Relations
James Pustejovsky
25th Annual Meeting of the Association for Computational Linguistics

pdf bib
Lexical Selection in the Process of Language Generation
James Pustejovsky | Sergei Nirenburg
25th Annual Meeting of the Association for Computational Linguistics

1986

pdf bib
TAG’s as a Grammatical Formalism for Generation
David D. McDonald | James D. Pustejovsky
Strategic Computing - Natural Language Workshop: Proceedings of a Workshop Held at Marina del Rey, California, May 1-2, 1986

1985

pdf bib
A Computational Theory of Prose Style for Natural Language Generation
David D. McDonald | James D. Pustejovsky
Second Conference of the European Chapter of the Association for Computational Linguistics

pdf bib
TAG’s as a Grammatical Formalism for Generation
David D. McDonald | James D. Pustejovsky
23rd Annual Meeting of the Association for Computational Linguistics

bib
The Level Hypothesis in Discourse Analysis
James Pustejovsky
Proceedings of the first Conference on Theoretical and Methodological Issues in Machine Translation of Natural Languages
