This paper presents detailed mappings between the structures used in Abstract Meaning Representation (AMR) and those used in Uniform Meaning Representation (UMR). These structures include general semantic roles, rolesets, and concepts that are largely shared between AMR and UMR, but with crucial differences. While UMR annotation of new low-resource languages is ongoing, AMR-annotated corpora already exist for many languages, and these AMR corpora are ripe for conversion to UMR format. Rather than focusing on semantic coverage that is new to UMR (which will likely need to be dealt with manually), this paper serves as a resource (with illustrated mappings) for users looking to understand the fine-grained adjustments that have been made to the representation techniques for semantic categoriespresent in both AMR and UMR.
This paper discusses the challenges of annotating the predicate-argument structure of Chinese verb compounds in Uniform Meaning Representation (UMR), a recent meaning representation framework that extends Abstract Meaning Representation (AMR) to cross-linguistic settings. The key issue is to decide whether to annotate the argument structure of a verb compound as a whole, or to annotate the argument structure of their component verbs as well as the relations between them. We examine different types of Chinese verb compounds, and propose how to annotate them based on the principle of compositionality, level of grammaticalization, and productivity of component verbs. We propose a solution to the practical problem of having to define the semantic roles for Chinese verb compounds that are quite open-ended by separating compositional verb compounds from verb compounds that are non-compositional or have grammaticalized verb components. For compositional verb compounds, instead of annotating the argument structure of the verb compound as a whole, we annotate the argument structure of the component verbs as well as the semantic relations between them as creating an exhaustive list of such verb compounds is infeasible. Verb compounds with grammaticalized verb components also tend to be productive and we represent grammaticalized verb compounds as either attributes of the primary verb or as relations.
As the sources of information that we consume everyday rapidly diversify, it is becoming increasingly important to develop NLP tools that help to evaluate the credibility of the information we receive. A critical step towards this goal is to determine the factuality of events in text. In this paper, we frame factuality assessment as a modal dependency parsing task that identifies the events and their sources, formally known as conceivers, and then determine the level of certainty that the sources are asserting with respect to the events. We crowdsource the first large-scale data set annotated with modal dependency structures that consists of 353 Covid-19 related news articles, 24,016 events, and 2,938 conceivers. We also develop the first modal dependency parser that jointly extracts events, conceivers and constructs the modal dependency structure of a text. We evaluate the joint model against a pipeline model and demonstrate the advantage of the joint model in conceiver extraction and modal dependency structure construction when events and conceivers are automatically extracted. We believe the dataset and the models will be a valuable resource for a whole host of NLP applications such as fact checking and rumor detection.
We present UMR-Writer, a web-based application for annotating Uniform Meaning Representations (UMR), a graph-based, cross-linguistically applicable semantic representation developed recently to support the development of interpretable natural language applications that require deep semantic analysis of texts. We present the functionalities of UMR-Writer and discuss the challenges in developing such a tool and how they are addressed.