2024
MoCCA: A Model of Comparative Concepts for Aligning Constructicons
Arthur Lorenzi | Peter Ljunglöf | Ben Lyngfelt | Tiago Timponi Torrent | William Croft | Alexander Ziem | Nina Böbel | Linnéa Bäckström | Peter Uhrig | Ely E. Matos
Proceedings of the 20th Joint ACL - ISO Workshop on Interoperable Semantic Annotation @ LREC-COLING 2024
This paper presents MoCCA, a Model of Comparative Concepts for Aligning Constructicons under development by a consortium of research groups building Constructicons of different languages including Brazilian Portuguese, English, German and Swedish. The Constructicons will be aligned by using comparative concepts (CCs) providing language-neutral definitions of linguistic properties. The CCs are drawn from typological research on grammatical categories and constructions, and from FrameNet frames, organized in a conceptual network. Language-specific constructions are linked to the CCs in accordance with general principles. MoCCA is organized into files of two types: a largely static CC Database file and multiple Linking files containing relations between constructions in a Constructicon and the CCs. Tools are planned to facilitate visualization of the CC network and linking of constructions to the CCs. All files and guidelines will be versioned, and a mechanism is set up to report cases where a language-specific construction cannot be easily linked to existing CCs.
Building a Broad Infrastructure for Uniform Meaning Representations
Julia Bonn | Matthew J. Buchholz | Jayeol Chun | Andrew Cowell | William Croft | Lukas Denk | Sijia Ge | Jan Hajič | Kenneth Lai | James H. Martin | Skatje Myers | Alexis Palmer | Martha Palmer | Claire Benet Post | James Pustejovsky | Kristine Stenzel | Haibo Sun | Zdeňka Urešová | Rosa Vallejos | Jens E. L. Van Gysel | Meagan Vigus | Nianwen Xue | Jin Zhao
Proceedings of the 2024 Joint International Conference on Computational Linguistics, Language Resources and Evaluation (LREC-COLING 2024)
This paper reports the first release of the UMR (Uniform Meaning Representation) data set. UMR is a graph-based meaning representation formalism consisting of a sentence-level graph and a document-level graph. The sentence-level graph represents predicate-argument structures, named entities, word senses, aspectuality of events, as well as person and number information for entities. The document-level graph represents coreferential, temporal, and modal relations that go beyond sentence boundaries. UMR is designed to capture the commonalities and variations across languages and this is done through the use of a common set of abstract concepts, relations, and attributes as well as concrete concepts derived from words from individual languages. This UMR release includes annotations for six languages (Arapaho, Chinese, English, Kukama, Navajo, Sanapana) that vary greatly in terms of their linguistic properties and resource availability. We also describe ongoing efforts to enlarge this data set and extend it to other genres and modalities. We also briefly describe the available infrastructure (UMR annotation guidelines and tools) that others can use to create similar data sets.
UCxn: Typologically Informed Annotation of Constructions Atop Universal Dependencies
Leonie Weissweiler | Nina Böbel | Kirian Guiller | Santiago Herrera | Wesley Scivetti | Arthur Lorenzi | Nurit Melnik | Archna Bhatia | Hinrich Schütze | Lori Levin | Amir Zeldes | Joakim Nivre | William Croft | Nathan Schneider
Proceedings of the 2024 Joint International Conference on Computational Linguistics, Language Resources and Evaluation (LREC-COLING 2024)
The Universal Dependencies (UD) project has created an invaluable collection of treebanks with contributions in over 140 languages. However, the UD annotations do not tell the full story. Grammatical constructions that convey meaning through a particular combination of several morphosyntactic elements—for example, interrogative sentences with special markers and/or word orders—are not labeled holistically. We argue for (i) augmenting UD annotations with a ‘UCxn’ annotation layer for such meaning-bearing grammatical constructions, and (ii) approaching this in a typologically informed way so that morphosyntactic strategies can be compared across languages. As a case study, we consider five construction families in ten languages, identifying instances of each construction in UD treebanks through the use of morphosyntactic patterns. In addition to findings regarding these particular constructions, our study yields important insights on methodology for describing and identifying constructions in language-general and language-particular ways, and lays the foundation for future constructional enrichment of UD treebanks.
2023
Mapping AMR to UMR: Resources for Adapting Existing Corpora for Cross-Lingual Compatibility
Julia Bonn | Skatje Myers | Jens E. L. Van Gysel | Lukas Denk | Meagan Vigus | Jin Zhao | Andrew Cowell | William Croft | Jan Hajič | James H. Martin | Alexis Palmer | Martha Palmer | James Pustejovsky | Zdeňka Urešová | Rosa Vallejos | Nianwen Xue
Proceedings of the 21st International Workshop on Treebanks and Linguistic Theories (TLT, GURT/SyntaxFest 2023)
This paper presents detailed mappings between the structures used in Abstract Meaning Representation (AMR) and those used in Uniform Meaning Representation (UMR). These structures include general semantic roles, rolesets, and concepts that are largely shared between AMR and UMR, but with crucial differences. While UMR annotation of new low-resource languages is ongoing, AMR-annotated corpora already exist for many languages, and these AMR corpora are ripe for conversion to UMR format. Rather than focusing on semantic coverage that is new to UMR (which will likely need to be dealt with manually), this paper serves as a resource (with illustrated mappings) for users looking to understand the fine-grained adjustments that have been made to the representation techniques for semantic categories present in both AMR and UMR.
2021
Theoretical and Practical Issues in the Semantic Annotation of Four Indigenous Languages
Jens E. L. Van Gysel | Meagan Vigus | Lukas Denk | Andrew Cowell | Rosa Vallejos | Tim O’Gorman | William Croft
Proceedings of the Joint 15th Linguistic Annotation Workshop (LAW) and 3rd Designing Meaning Representations (DMR) Workshop
Computational resources such as semantically annotated corpora can play an important role in enabling speakers of indigenous minority languages to participate in government, education, and other domains of public life in their own language. However, many languages – mainly those with small native speaker populations and without written traditions – have little to no digital support. One hurdle in creating such resources is that for many languages, few speakers would be capable of annotating texts – a task which requires literacy and some linguistic training – and that these experts’ time is typically in high demand for language planning work. This paper assesses whether typologically trained non-speakers of an indigenous language can feasibly perform semantic annotation using Uniform Meaning Representations, thus allowing for the creation of computational materials without putting further strain on community resources.
2020
Proceedings of the Second International Workshop on Designing Meaning Representations
Nianwen Xue | Johan Bos | William Croft | Jan Hajič | Chu-Ren Huang | Stephan Oepen | Martha Palmer | James Pustejovsky
Proceedings of the Second International Workshop on Designing Meaning Representations
Cross-lingual annotation: a road map for low- and no-resource languages
Meagan Vigus | Jens E. L. Van Gysel | Tim O’Gorman | Andrew Cowell | Rosa Vallejos | William Croft
Proceedings of the Second International Workshop on Designing Meaning Representations
This paper presents a “road map” for the annotation of semantic categories in typologically diverse languages, with potentially few linguistic resources, and often no existing computational resources. Past semantic annotation efforts have focused largely on high-resource languages, or relatively low-resource languages with a large number of native speakers. However, there are certain typological traits, namely the synthesis of multiple concepts into a single word, that are more common in languages with a smaller speech community. For example, what is expressed as a sentence in a more analytic language like English, may be expressed as a single word in a more synthetic language like Arapaho. This paper proposes solutions for annotating analytic and synthetic languages in a comparable way based on existing typological research, and introduces a road map for the annotation of languages with a dearth of resources.
Representing constructional metaphors
Pavlina Kalm | Michael Regan | Sook-kyung Lee | Chris Peverada | William Croft
Proceedings of the Second International Workshop on Designing Meaning Representations
This paper introduces a representation and annotation scheme for argument structure constructions that are used metaphorically with verbs in different semantic domains. We aim to contribute to the study of constructional metaphors which has received little attention in theoretical and computational linguistics. The proposed representation consists of a systematic mapping between the constructional and verbal event structures in two domains. It reveals the semantic motivations that lead to constructions being metaphorically extended. We demonstrate this representation on argument structure constructions with Transfer of Possession verbs and test the viability of this scheme with an annotation exercise.
2019
Proceedings of the First International Workshop on Designing Meaning Representations
Nianwen Xue | William Croft | Jan Hajič | Chu-Ren Huang | Stephan Oepen | Martha Palmer | James Pustejovsky
Proceedings of the First International Workshop on Designing Meaning Representations
Cross-Linguistic Semantic Annotation: Reconciling the Language-Specific and the Universal
Jens E. L. Van Gysel | Meagan Vigus | Pavlina Kalm | Sook-kyung Lee | Michael Regan | William Croft
Proceedings of the First International Workshop on Designing Meaning Representations
Developers of cross-linguistic semantic annotation schemes face a number of issues not encountered in monolingual annotation. This paper discusses four such issues, related to the establishment of annotation labels, and the treatment of languages with more fine-grained, more coarse-grained, and cross-cutting categories. We propose that a lattice-like architecture of the annotation categories can adequately handle all four issues, and at the same time remain both intuitive for annotators and faithful to typological insights. This position is supported by a brief annotation experiment.
Event Structure Representation: Between Verbs and Argument Structure Constructions
Pavlina Kalm | Michael Regan | William Croft
Proceedings of the First International Workshop on Designing Meaning Representations
This paper proposes a novel representation of event structure by separating verbal semantics and the meaning of argument structure constructions that verbs occur in. Our model demonstrates how the two meaning representations interact. Our model thus effectively deals with various verb construals in different argument structure constructions, unlike purely verb-based approaches. However, unlike many constructionally-based approaches, we also provide a richer representation of the event structure evoked by the verb meaning.
A Dependency Structure Annotation for Modality
Meagan Vigus | Jens E. L. Van Gysel | William Croft
Proceedings of the First International Workshop on Designing Meaning Representations
This paper presents an annotation scheme for modality that employs a dependency structure. Events and sources (here, conceivers) are represented as nodes and epistemic strength relations characterize the edges. The epistemic strength values are largely based on Saurí and Pustejovsky’s (2009) FactBank, while the dependency structure mirrors Zhang and Xue’s (2018b) approach to temporal relations. Six documents containing 377 events have been annotated by two expert annotators with high levels of agreement.
2018
A Rich Annotation Scheme for Mental Events
William Croft | Pavlína Pešková | Michael Regan | Sook-kyung Lee
Proceedings of the Workshop Events and Stories in the News 2018
We present a rich annotation scheme for the structure of mental events. Mental events are those in which the verb describes a mental state or process, usually oriented towards an external situation. While physical events have been described in detail and there are numerous studies of their semantic analysis and annotation, mental events are less thoroughly studied. The annotation scheme proposed here is based on decompositional analyses in the semantic and typological linguistic literature. The scheme was applied to the news corpus from the 2016 Events workshop, and error analysis of the test annotation provides suggestions for refinement and clarification of the annotation scheme.
Annotation of Tense and Aspect Semantics for Sentential AMR
Lucia Donatelli | Michael Regan | William Croft | Nathan Schneider
Proceedings of the Joint Workshop on Linguistic Annotation, Multiword Expressions and Constructions (LAW-MWE-CxG-2018)
Although English grammar encodes a number of semantic contrasts with tense and aspect marking, these semantics are currently ignored by Abstract Meaning Representation (AMR) annotations. This paper extends sentence-level AMR to include a coarse-grained treatment of tense and aspect semantics. The proposed framework augments the representation of finite predications to include a four-way temporal distinction (event time before, up to, at, or after speech time) and several aspectual distinctions (including static vs. dynamic, habitual vs. episodic, and telic vs. atelic). This will enable AMR to be used for NLP tasks and applications that require sophisticated reasoning about time and event structure.
2017
Integrating Decompositional Event Structures into Storylines
William Croft | Pavlína Pešková | Michael Regan
Proceedings of the Events and Stories in the News Workshop
Storyline research links together events in stories and specifies shared participants in those stories. In these analyses, an atomic event is assumed to be a single clause headed by a single verb. However, many analyses of verbal semantics assume a decompositional analysis of events expressed in single clauses. We present a formalization of a decompositional analysis of events in which each participant in a clausal event has their own temporally extended subevent, and the subevents are related through causal and other interactions. This decomposition allows us to represent storylines as an evolving set of interactions between participants over time.
2016
Annotation of causal and aspectual structure of events in RED: a preliminary report
William Croft | Pavlína Pešková | Michael Regan
Proceedings of the Fourth Workshop on Events
1987
Commonsense Metaphysics and Lexical Semantics
Jerry R. Hobbs | William Croft | Todd Davies | Douglas Edwards | Kenneth Laws
Computational Linguistics, Formerly the American Journal of Computational Linguistics, Volume 13, Numbers 3-4, July-December 1987
1986
Commonsense Metaphysics and Lexical Semantics
Jerry R. Hobbs | William Croft | Todd Davies | Douglas Edwards | Kenneth Laws
Strategic Computing - Natural Language Workshop: Proceedings of a Workshop Held at Marina del Rey, California, May 1-2, 1986
Commonsense Metaphysics and Lexical Semantics
Jerry R. Hobbs | William Croft | Todd Davies | Douglas Edwards | Kenneth Laws
24th Annual Meeting of the Association for Computational Linguistics