2024
Proceedings of the Fifth International Workshop on Designing Meaning Representations @ LREC-COLING 2024
Claire Bonial | Julia Bonn | Jena D. Hwang
Proceedings of the Fifth International Workshop on Designing Meaning Representations @ LREC-COLING 2024
PropBank-Powered Data Creation: Utilizing Sense-Role Labelling to Generate Disaster Scenario Data
Mollie Frances Shichman | Claire Bonial | Taylor A. Hudson | Austin Blodgett | Francis Ferraro | Rachel Rudinger
Proceedings of the Fifth International Workshop on Designing Meaning Representations @ LREC-COLING 2024
For human-robot dialogue in a search-and-rescue scenario, a strong knowledge of the conditions and objects a robot will face is essential for effective interpretation of natural language instructions. In order to utilize the power of large language models without overwhelming the limited storage capacity of a robot, we propose PropBank-Powered Data Creation. PropBank-Powered Data Creation is an expert-in-the-loop data generation pipeline which creates training data for disaster-specific language models. We leverage semantic role labeling and Rich Event Ontology resources to efficiently develop seed sentences for fine-tuning a smaller, targeted model that could operate onboard a robot for disaster relief. We developed 32 sentence templates, which we used to make 2 seed datasets of 175 instructions for earthquake search and rescue and train derailment response. We further leverage our seed datasets as evaluation data to test our baseline fine-tuned models.
Adjudicating LLMs as PropBank Adjudicators
Julia Bonn | Harish Tayyar Madabushi | Jena D. Hwang | Claire Bonial
Proceedings of the Fifth International Workshop on Designing Meaning Representations @ LREC-COLING 2024
We evaluate the ability of large language models (LLMs) to provide PropBank semantic role label annotations across different realizations of the same verbs in transitive, intransitive, and middle voice constructions. In order to assess the meta-linguistic capabilities of LLMs as well as their ability to glean such capabilities through in-context learning, we evaluate the models in a zero-shot setting, in a setting where it is given three examples of another verb used in transitive, intransitive, and middle voice constructions, and finally in a setting where it is given the examples as well as the correct sense and roleset information. We find that zero-shot knowledge of PropBank annotation is almost nonexistent. The largest model evaluated, GPT-4, achieves the best performance in the setting where it is given both examples and the correct roleset in the prompt, demonstrating that larger models can ascertain some meta-linguistic capabilities through in-context learning. However, even in this setting, which is simpler than the task of a human in PropBank annotation, the model achieves only 48% accuracy in marking numbered arguments correctly. To ensure transparency and reproducibility, we publicly release our dataset and model responses.
A Construction Grammar Corpus of Varying Schematicity: A Dataset for the Evaluation of Abstractions in Language Models
Claire Bonial | Harish Tayyar Madabushi
Proceedings of the 2024 Joint International Conference on Computational Linguistics, Language Resources and Evaluation (LREC-COLING 2024)
Large Language Models (LLMs) have been developed without a theoretical framework, yet we posit that evaluating and improving LLMs will benefit from the development of theoretical frameworks that enable comparison of the structures of human language and the model of language built up by LLMs through the processing of text. In service of this goal, we develop the Construction Grammar Schematicity (“CoGS”) corpus of 10 distinct English constructions, where the constructions vary with respect to schematicity, or in other words the level to which constructional slots require specific, fixed lexical items, or can be filled with a variety of elements that fulfill a particular semantic role of the slot. Our corpus constructions are carefully curated to range from substantive, frozen constructions (e.g., Let-alone) to entirely schematic constructions (e.g., Resultative). The corpus was collected to allow us to probe LLMs for constructional information at varying levels of abstraction. We present our own probing experiments using this corpus, which clearly demonstrate that even the largest LLMs are limited to more substantive constructions and do not exhibit recognition of the similarity of purely schematic constructions. We publicly release our dataset, prompts, and associated model responses.
SCOUT: A Situated and Multi-Modal Human-Robot Dialogue Corpus
Stephanie M. Lukin | Claire Bonial | Matthew Marge | Taylor A. Hudson | Cory J. Hayes | Kimberly Pollard | Anthony Baker | Ashley N. Foots | Ron Artstein | Felix Gervits | Mitchell Abrams | Cassidy Henry | Lucia Donatelli | Anton Leuski | Susan G. Hill | David Traum | Clare Voss
Proceedings of the 2024 Joint International Conference on Computational Linguistics, Language Resources and Evaluation (LREC-COLING 2024)
We introduce the Situated Corpus Of Understanding Transactions (SCOUT), a multi-modal collection of human-robot dialogue in the task domain of collaborative exploration. The corpus was constructed from multiple Wizard-of-Oz experiments where human participants gave verbal instructions to a remotely-located robot to move and gather information about its surroundings. SCOUT contains 89,056 utterances and 310,095 words from 278 dialogues averaging 320 utterances per dialogue. The dialogues are aligned with the multi-modal data streams available during the experiments: 5,785 images and 30 maps. The corpus has been annotated with Abstract Meaning Representation and Dialogue-AMR to identify the speaker’s intent and meaning within an utterance, and with Transactional Units and Relations to track relationships between utterances to reveal patterns of the Dialogue Structure. We describe how the corpus and its annotations have been used to develop autonomous human-robot systems and enable research in open questions of how humans speak to robots. We release this corpus to accelerate progress in autonomous, situated, human-robot dialogue, especially in the context of navigation tasks where details about the environment need to be discovered.
2023
Proceedings of the First International Workshop on Construction Grammars and NLP (CxGs+NLP, GURT/SyntaxFest 2023)
Claire Bonial | Harish Tayyar Madabushi
Proceedings of the First International Workshop on Construction Grammars and NLP (CxGs+NLP, GURT/SyntaxFest 2023)
What Else Do I Need to Know? The Effect of Background Information on Users’ Reliance on QA Systems
Navita Goyal | Eleftheria Briakou | Amanda Liu | Connor Baumler | Claire Bonial | Jeffrey Micher | Clare Voss | Marine Carpuat | Hal Daumé III
Proceedings of the 2023 Conference on Empirical Methods in Natural Language Processing
NLP systems have shown impressive performance at answering questions by retrieving relevant context. However, with the increasingly large models, it is impossible and often undesirable to constrain models’ knowledge or reasoning to only the retrieved context. This leads to a mismatch between the information that the models access to derive the answer and the information that is available to the user to assess the model predicted answer. In this work, we study how users interact with QA systems in the absence of sufficient information to assess their predictions. Further, we ask whether adding the requisite background helps mitigate users’ over-reliance on predictions. Our study reveals that users rely on model predictions even in the absence of sufficient information needed to assess the model’s correctness. Providing the relevant background, however, helps users better catch model errors, reducing over-reliance on incorrect predictions. On the flip side, background information also increases users’ confidence in their accurate as well as inaccurate judgments. Our work highlights that supporting users’ verification of QA predictions is an important, yet challenging, problem.
Abstract Meaning Representation for Grounded Human-Robot Communication
Claire Bonial | Julie Foresta | Nicholas C. Fung | Cory J. Hayes | Philip Osteen | Jacob Arkin | Benned Hedegaard | Thomas Howard
Proceedings of the Fourth International Workshop on Designing Meaning Representations
To collaborate effectively in physically situated tasks, robots must be able to ground concepts in natural language to the physical objects in the environment as well as their own capabilities. We describe the implementation and the demonstration of a system architecture that supports tasking robots using natural language. In this architecture, natural language instructions are first handled by a dialogue management component, which provides feedback to the user and passes executable instructions along to an Abstract Meaning Representation (AMR) parser. The parse distills the action primitives and parameters of the instructed behavior in the form of a directed acyclic graph, passed on to the grounding component. We find AMR to be an efficient formalism for grounding the nodes of the graph using a Distributed Correspondence Graph. Thus, in our approach, the concepts of language are grounded to entities in the robot’s world model, which is populated by its sensors, thereby enabling grounded natural language communication. The demonstration of this system will allow users to issue navigation commands in natural language to direct a simulated ground robot (running the Robot Operating System) to various landmarks observed by the user within a simulated environment.
Use Defines Possibilities: Reasoning about Object Function to Interpret and Execute Robot Instructions
Mollie Shichman | Claire Bonial | Austin Blodgett | Taylor Hudson | Francis Ferraro | Rachel Rudinger
Proceedings of the 15th International Conference on Computational Semantics
Language models have shown great promise in common-sense related tasks. However, it remains unseen how they would perform in the context of physically situated human-robot interactions, particularly in disaster-relief scenarios. In this paper, we develop a language model evaluation dataset with more than 800 cloze sentences, written to probe for the function of over 200 objects. The sentences are divided into two tasks: an “easy” task where the language model has to choose between vocabulary with different functions (Task 1), and a “challenge” where it has to choose between vocabulary with the same function, yet only one vocabulary item is appropriate given real world constraints on functionality (Task 2). DistilBERT performs with about 80% accuracy for both tasks. To investigate how annotator variability affected those results, we developed a follow-on experiment where we compared our original results with wrong answers chosen based on embedding vector distances. Those results showed increased precision across documents but a 15% decrease in accuracy. We conclude that language models do have a strong knowledge basis for object reasoning, but will require creative fine-tuning strategies in order to be successfully deployed.
2022
The Search for Agreement on Logical Fallacy Annotation of an Infodemic
Claire Bonial | Austin Blodgett | Taylor Hudson | Stephanie M. Lukin | Jeffrey Micher | Douglas Summers-Stay | Peter Sutor | Clare Voss
Proceedings of the Thirteenth Language Resources and Evaluation Conference
We evaluate an annotation schema for labeling logical fallacy types, originally developed for a crowd-sourcing annotation paradigm, now using an annotation paradigm of two trained linguist annotators. We apply the schema to a variety of different genres of text relating to the COVID-19 pandemic. Our linguist (as opposed to crowd-sourced) annotation of logical fallacies allows us to evaluate whether the annotation schema category labels are sufficiently clear and non-overlapping for both manual and, later, system assignment. We report inter-annotator agreement results over two annotation phases as well as a preliminary assessment of the corpus for training and testing a machine learning algorithm (Pattern-Exploiting Training) for fallacy detection and recognition. The agreement results and system performance underscore the challenging nature of this annotation task and suggest that the annotation schema and paradigm must be iteratively evaluated and refined in order to arrive at a set of annotation labels that can be reproduced by human annotators and, in turn, provide reliable training data for automatic detection and recognition systems.
2021
What Can a Generative Language Model Answer About a Passage?
Douglas Summers-Stay | Claire Bonial | Clare Voss
Proceedings of the 3rd Workshop on Machine Reading for Question Answering
Generative language models trained on large, diverse corpora can answer questions about a passage by generating the most likely continuation of the passage followed by a question/answer pair. However, accuracy rates vary depending on the type of question asked. In this paper we keep the passage fixed, and test with a wide variety of question types, exploring the strengths and weaknesses of the GPT-3 language model. We provide the passage and test questions as a challenge set for other language models.
Proceedings of the Joint 15th Linguistic Annotation Workshop (LAW) and 3rd Designing Meaning Representations (DMR) Workshop
Claire Bonial | Nianwen Xue
Proceedings of the Joint 15th Linguistic Annotation Workshop (LAW) and 3rd Designing Meaning Representations (DMR) Workshop
Builder, we have done it: Evaluating & Extending Dialogue-AMR NLU Pipeline for Two Collaborative Domains
Claire Bonial | Mitchell Abrams | David Traum | Clare Voss
Proceedings of the 14th International Conference on Computational Semantics (IWCS)
We adopt, evaluate, and improve upon a two-step natural language understanding (NLU) pipeline that incrementally tames the variation of unconstrained natural language input and maps to executable robot behaviors. The pipeline first leverages Abstract Meaning Representation (AMR) parsing to capture the propositional content of the utterance, and second converts this into “Dialogue-AMR,” which augments standard AMR with information on tense, aspect, and speech acts. Several alternative approaches and training datasets are evaluated for both steps and corresponding components of the pipeline, some of which outperform the original. We extend the Dialogue-AMR annotation schema to cover a different collaborative instruction domain and evaluate on both domains. With very little training data, we achieve promising performance in the new domain, demonstrating the scalability of this approach.
2020
Dialogue-AMR: Abstract Meaning Representation for Dialogue
Claire Bonial | Lucia Donatelli | Mitchell Abrams | Stephanie M. Lukin | Stephen Tratz | Matthew Marge | Ron Artstein | David Traum | Clare Voss
Proceedings of the Twelfth Language Resources and Evaluation Conference
This paper describes a schema that enriches Abstract Meaning Representation (AMR) in order to provide a semantic representation for facilitating Natural Language Understanding (NLU) in dialogue systems. AMR offers a valuable level of abstraction of the propositional content of an utterance; however, it does not capture the illocutionary force or speaker’s intended contribution in the broader dialogue context (e.g., make a request or ask a question), nor does it capture tense or aspect. We explore dialogue in the domain of human-robot interaction, where a conversational robot is engaged in search and navigation tasks with a human partner. To address the limitations of standard AMR, we develop an inventory of speech acts suitable for our domain, and present “Dialogue-AMR”, an enhanced AMR that represents not only the content of an utterance, but the illocutionary force behind it, as well as tense and aspect. To showcase the coverage of the schema, we use both manual and automatic methods to construct the “DialAMR” corpus—a corpus of human-robot dialogue annotated with standard AMR and our enriched Dialogue-AMR schema. Our automated methods can be used to incorporate AMR into a larger NLU pipeline supporting human-robot dialogue.
InfoForager: Leveraging Semantic Search with AMR for COVID-19 Research
Claire Bonial | Stephanie M. Lukin | David Doughty | Steven Hill | Clare Voss
Proceedings of the Second International Workshop on Designing Meaning Representations
This paper examines how Abstract Meaning Representation (AMR) can be utilized for finding answers to research questions in medical scientific documents, in particular, to advance the study of UV (ultraviolet) inactivation of the novel coronavirus that causes the disease COVID-19. We describe the development of a proof-of-concept prototype tool, InfoForager, which uses AMR to conduct a semantic search, targeting the meaning of the user question, and matching this to sentences in medical documents that may contain information to answer that question. This work was conducted as a sprint over a period of six weeks, and reveals both promising results and challenges in reducing the user search time relating to COVID-19 research, and in general, domain adaption of AMR for this task.
Graph-to-Graph Meaning Representation Transformations for Human-Robot Dialogue
Mitchell Abrams | Claire Bonial | Lucia Donatelli
Proceedings of the Society for Computation in Linguistics 2020
Proceedings of the First Joint Workshop on Narrative Understanding, Storylines, and Events
Claire Bonial | Tommaso Caselli | Snigdha Chaturvedi | Elizabeth Clark | Ruihong Huang | Mohit Iyyer | Alejandro Jaimes | Heng Ji | Lara J. Martin | Ben Miller | Teruko Mitamura | Nanyun Peng | Joel Tetreault
Proceedings of the First Joint Workshop on Narrative Understanding, Storylines, and Events
2019
Abstract Meaning Representation for Human-Robot Dialogue
Claire N. Bonial | Lucia Donatelli | Jessica Ervin | Clare R. Voss
Proceedings of the Society for Computation in Linguistics (SCiL) 2019
Augmenting Abstract Meaning Representation for Human-Robot Dialogue
Claire Bonial | Lucia Donatelli | Stephanie M. Lukin | Stephen Tratz | Ron Artstein | David Traum | Clare Voss
Proceedings of the First International Workshop on Designing Meaning Representations
We detail refinements made to Abstract Meaning Representation (AMR) that make the representation more suitable for supporting a situated dialogue system, where a human remotely controls a robot for purposes of search and rescue and reconnaissance. We propose 36 augmented AMRs that capture speech acts, tense and aspect, and spatial information. This linguistic information is vital for representing important distinctions, for example whether the robot has moved, is moving, or will move. We evaluate two existing AMR parsers for their performance on dialogue data. We also outline a model for graph-to-graph conversion, in which output from AMR parsers is converted into our refined AMRs. The design scheme presented here, though task-specific, is extendable for broad coverage of speech acts using AMR in future task-independent work.
2018
Automatically Extracting Qualia Relations for the Rich Event Ontology
Ghazaleh Kazeminejad | Claire Bonial | Susan Windisch Brown | Martha Palmer
Proceedings of the 27th International Conference on Computational Linguistics
Commonsense, real-world knowledge about the events that entities or “things in the world” are typically involved in, as well as part-whole relationships, is valuable for allowing computational systems to draw everyday inferences about the world. Here, we focus on automatically extracting information about (1) the events that typically bring about certain entities (origins), (2) the events that are the typical functions of entities, and (3) part-whole relationships in entities. These correspond to the agentive, telic and constitutive qualia central to the Generative Lexicon. We describe our motivations and methods for extracting these qualia relations from the Suggested Upper Merged Ontology (SUMO) and show that human annotators overwhelmingly find the information extracted to be reasonable. Because ontologies provide a way of structuring this information and making it accessible to agents and computational systems generally, efforts are underway to incorporate the extracted information to an ontology hub of Natural Language Processing semantic role labeling resources, the Rich Event Ontology.
Dialogue Structure Annotation for Multi-Floor Interaction
David Traum | Cassidy Henry | Stephanie Lukin | Ron Artstein | Felix Gervits | Kimberly Pollard | Claire Bonial | Su Lei | Clare Voss | Matthew Marge | Cory Hayes | Susan Hill
Proceedings of the Eleventh International Conference on Language Resources and Evaluation (LREC 2018)
Abstract Meaning Representation of Constructions: The More We Include, the Better the Representation
Claire Bonial | Bianca Badarau | Kira Griffitt | Ulf Hermjakob | Kevin Knight | Tim O’Gorman | Martha Palmer | Nathan Schneider
Proceedings of the Eleventh International Conference on Language Resources and Evaluation (LREC 2018)
Proceedings of the Workshop Events and Stories in the News 2018
Tommaso Caselli | Ben Miller | Marieke van Erp | Piek Vossen | Martha Palmer | Eduard Hovy | Teruko Mitamura | David Caswell | Susan W. Brown | Claire Bonial
Proceedings of the Workshop Events and Stories in the News 2018
Can You Spot the Semantic Predicate in this Video?
Christopher Reale | Claire Bonial | Heesung Kwon | Clare Voss
Proceedings of the Workshop Events and Stories in the News 2018
We propose a method to improve human activity recognition in video by leveraging semantic information about the target activities from an expert-defined linguistic resource, VerbNet. Our hypothesis is that activities that share similar event semantics, as defined by the semantic predicates of VerbNet, will be more likely to share some visual components. We use a deep convolutional neural network approach as a baseline and incorporate linguistic information from VerbNet through multi-task learning. We present results of experiments showing the added information has negligible impact on recognition performance. We discuss how this may be because the lexical semantic information defined by VerbNet is generally not visually salient given the video processing approach used here, and how we may handle this in future approaches.
Towards a Computational Lexicon for Moroccan Darija: Words, Idioms, and Constructions
Jamal Laoudi | Claire Bonial | Lucia Donatelli | Stephen Tratz | Clare Voss
Proceedings of the Joint Workshop on Linguistic Annotation, Multiword Expressions and Constructions (LAW-MWE-CxG-2018)
In this paper, we explore the challenges of building a computational lexicon for Moroccan Darija (MD), an Arabic dialect spoken by over 32 million people worldwide but which only recently has begun appearing frequently in written form in social media. We raise the question of what belongs in such a lexicon and start by describing our work building traditional word-level lexicon entries with their English translations. We then discuss challenges in translating idiomatic MD text that led to creating multi-word expression lexicon entries whose meanings could not be fully derived from the individual words. Finally, we provide a preliminary exploration of constructions to be considered for inclusion in an MD constructicon by translating examples of English constructions and examining their MD counterparts.
Constructing an Annotated Corpus of Verbal MWEs for English
Abigail Walsh | Claire Bonial | Kristina Geeraert | John P. McCrae | Nathan Schneider | Clarissa Somers
Proceedings of the Joint Workshop on Linguistic Annotation, Multiword Expressions and Constructions (LAW-MWE-CxG-2018)
This paper describes the construction and annotation of a corpus of verbal MWEs for English, as part of the PARSEME Shared Task 1.1 on automatic identification of verbal MWEs. The criteria for corpus selection, the categories of MWEs used, and the training process are discussed, along with the particular issues that led to revisions in edition 1.1 of the annotation guidelines. Finally, an overview of the characteristics of the final annotated corpus is presented, as well as some discussion on inter-annotator agreement.
Consequences and Factors of Stylistic Differences in Human-Robot Dialogue
Stephanie Lukin | Kimberly Pollard | Claire Bonial | Matthew Marge | Cassidy Henry | Ron Artstein | David Traum | Clare Voss
Proceedings of the 19th Annual SIGdial Meeting on Discourse and Dialogue
This paper identifies stylistic differences in instruction-giving observed in a corpus of human-robot dialogue. Differences in verbosity and structure (i.e., single-intent vs. multi-intent instructions) arose naturally without restrictions or prior guidance on how users should speak with the robot. Different styles were found to produce different rates of miscommunication, and correlations were found between style differences and individual user variation, trust, and interaction experience with the robot. Understanding potential consequences and factors that influence style can inform design of dialogue systems that are robust to natural variation from human users.
2017
The Rich Event Ontology
Susan Brown | Claire Bonial | Leo Obrst | Martha Palmer
Proceedings of the Events and Stories in the News Workshop
In this paper we describe a new lexical semantic resource, The Rich Event Ontology, which provides an independent conceptual backbone to unify existing semantic role labeling (SRL) schemas and augment them with event-to-event causal and temporal relations. By unifying the FrameNet, VerbNet, Automatic Content Extraction, and Rich Entities, Relations and Events resources, the ontology serves as a shared hub for the disparate annotation schemas and therefore enables the combination of SRL training data into a larger, more diverse corpus. By adding temporal and causal relational information not found in any of the independent resources, the ontology facilitates reasoning on and across documents, revealing relationships between events that come together in temporal and causal chains to build more complex scenarios. We envision the open resource serving as a valuable tool for both moving from the ontology to text to query for event types and scenarios of interest, and for moving from text to the ontology to access interpretations of events using the combined semantic information housed there.
Exploring Variation of Natural Human Commands to a Robot in a Collaborative Navigation Task
Matthew Marge | Claire Bonial | Ashley Foots | Cory Hayes | Cassidy Henry | Kimberly Pollard | Ron Artstein | Clare Voss | David Traum
Proceedings of the First Workshop on Language Grounding for Robotics
Robot-directed communication is variable, and may change based on human perception of robot capabilities. To collect training data for a dialogue system and to investigate possible communication changes over time, we developed a Wizard-of-Oz study that (a) simulates a robot’s limited understanding, and (b) collects dialogues where human participants build a progressively better mental model of the robot’s understanding. With ten participants, we collected ten hours of human-robot dialogue. We analyzed the structure of instructions that participants gave to a remote robot before it responded. Our findings show a general initial preference for including metric information (e.g., move forward 3 feet) over landmarks (e.g., move to the desk) in motion commands, but this decreased over time, suggesting changes in perception.
2016
Multimodal Use of an Upper-Level Event Ontology
Claire Bonial | David Tahmoush | Susan Windisch Brown | Martha Palmer
Proceedings of the Fourth Workshop on Events
Comprehensive and Consistent PropBank Light Verb Annotation
Claire Bonial | Martha Palmer
Proceedings of the Tenth International Conference on Language Resources and Evaluation (LREC'16)
Recent efforts have focused on expanding the annotation coverage of PropBank from verb relations to adjective and noun relations, as well as light verb constructions (e.g., make an offer, take a bath). While each new relation type has presented unique annotation challenges, ensuring consistent and comprehensive annotation of light verb constructions has proved particularly challenging, given that light verb constructions are semi-productive, difficult to define, and there are often borderline cases. This research describes the iterative process of developing PropBank annotation guidelines for light verb constructions, the current guidelines, and a comparison to related resources.
2014
PropBank: Semantics of New Predicate Types
Claire Bonial | Julia Bonn | Kathryn Conger | Jena D. Hwang | Martha Palmer
Proceedings of the Ninth International Conference on Language Resources and Evaluation (LREC'14)
This research focuses on expanding PropBank, a corpus annotated with predicate argument structures, with new predicate types; namely, noun, adjective and complex predicates, such as Light Verb Constructions. This effort is in part inspired by a sister project to PropBank, the Abstract Meaning Representation project, which also attempts to capture who is doing what to whom in a sentence, but does so in a way that abstracts away from syntactic structures. For example, alternate realizations of a ‘destroying’ event in the form of either the verb ‘destroy’ or the noun ‘destruction’ would receive the same Abstract Meaning Representation. In order for PropBank to reach the same level of coverage and continue to serve as the bedrock for Abstract Meaning Representation, predicate types other than verbs, which have previously gone without annotation, must be annotated. This research describes the challenges therein, including the development of new annotation practices that walk the line between abstracting away from language-particular syntactic facts to explore deeper semantics, and maintaining the connection between semantics and syntactic structures that has proven to be very valuable for PropBank as a corpus of training data for Natural Language Processing applications.
An Approach to Take Multi-Word Expressions
Claire Bonial | Meredith Green | Jenette Preciado | Martha Palmer
Proceedings of the 10th Workshop on Multiword Expressions (MWE)
SemLink+: FrameNet, VerbNet and Event Ontologies
Martha Palmer | Claire Bonial | Diana McCarthy
Proceedings of Frame Semantics in NLP: A Workshop in Honor of Chuck Fillmore (1929-2014)
The VerbCorner Project: Findings from Phase 1 of crowd-sourcing a semantic decomposition of verbs
Joshua K. Hartshorne | Claire Bonial | Martha Palmer
Proceedings of the 52nd Annual Meeting of the Association for Computational Linguistics (Volume 2: Short Papers)
2013
The VerbCorner Project: Toward an Empirically-Based Semantic Decomposition of Verbs
Joshua K. Hartshorne | Claire Bonial | Martha Palmer
Proceedings of the 2013 Conference on Empirical Methods in Natural Language Processing
Abstract Meaning Representation for Sembanking
Laura Banarescu | Claire Bonial | Shu Cai | Madalina Georgescu | Kira Griffitt | Ulf Hermjakob | Kevin Knight | Philipp Koehn | Martha Palmer | Nathan Schneider
Proceedings of the 7th Linguistic Annotation Workshop and Interoperability with Discourse
Expanding VerbNet with Sketch Engine
Claire Bonial | Orin Hargraves | Martha Palmer
Proceedings of the 6th International Conference on Generative Approaches to the Lexicon (GL2013)
Renewing and Revising SemLink
Claire Bonial | Kevin Stowe | Martha Palmer
Proceedings of the 2nd Workshop on Linked Data in Linguistics (LDL-2013): Representing and linking lexicons, terminologies and other language data
2011
Incorporating Coercive Constructions into a Verb Lexicon
Claire Bonial | Susan Windisch Brown | Jena D. Hwang | Christopher Parisien | Martha Palmer | Suzanne Stevenson
Proceedings of the ACL 2011 Workshop on Relational Models of Semantics
2010
Multilingual Propbank Annotation Tools: Cornerstone and Jubilee
Jinho Choi | Claire Bonial | Martha Palmer
Proceedings of the NAACL HLT 2010 Demonstration Session
PropBank Annotation of Multilingual Light Verb Constructions
Jena D. Hwang | Archna Bhatia | Claire Bonial | Aous Mansouri | Ashwini Vaidya | Nianwen Xue | Martha Palmer
Proceedings of the Fourth Linguistic Annotation Workshop
Propbank Frameset Annotation Guidelines Using a Dedicated Editor, Cornerstone
Jinho D. Choi | Claire Bonial | Martha Palmer
Proceedings of the Seventh International Conference on Language Resources and Evaluation (LREC'10)
This paper gives guidelines of how to create and update Propbank frameset files using a dedicated editor, Cornerstone. Propbank is a corpus in which the arguments of each verb predicate are annotated with their semantic roles in relation to the predicate. Propbank annotation also requires the choice of a sense ID for each predicate. Thus, for each predicate in Propbank, there exists a corresponding frameset file showing the expected predicate argument structure of each sense related to the predicate. Since most Propbank annotations are based on the predicate argument structure defined in the frameset files, it is important to keep the files consistent, simple to read as well as easy to update. The frameset files are written in XML, which can be difficult to edit when using a simple text editor. Therefore, it is helpful to develop a user-friendly editor such as Cornerstone, specifically customized to create and edit frameset files. Cornerstone runs platform independently, is light enough to run as an X11 application and supports multiple languages such as Arabic, Chinese, English, Hindi and Korean.
Propbank Instance Annotation Guidelines Using a Dedicated Editor, Jubilee
Jinho D. Choi | Claire Bonial | Martha Palmer
Proceedings of the Seventh International Conference on Language Resources and Evaluation (LREC'10)
This paper gives guidelines of how to annotate Propbank instances using a dedicated editor, Jubilee. Propbank is a corpus in which the arguments of each verb predicate are annotated with their semantic roles in relation to the predicate. Propbank annotation also requires the choice of a sense ID for each predicate. Jubilee facilitates this annotation process by displaying several resources of syntactic and semantic information simultaneously: the syntactic structure of a sentence is displayed in the main frame, the available senses with their corresponding argument structures are displayed in another frame, all available Propbank arguments are displayed for the annotator’s choice, and example annotations of each sense of the predicate are available to the annotator for viewing. Easy access to each of these resources allows the annotator to quickly absorb and apply the necessary syntactic and semantic information pertinent to each predicate for consistent and efficient annotation. Jubilee has been successfully adapted to many Propbank projects in several universities. The tool runs platform independently, is light enough to run as an X11 application and supports multiple languages such as Arabic, Chinese, English, Hindi and Korean.