Semantic Parsing and Sense Tagging the Princeton WordNet Gloss Corpus

Alexandre Rademaker, Abhishek Basu, Rajkiran Veluri


Abstract
In 2008, the Princeton team released the last version of the “Princeton Annotated Gloss Corpus”. In this corpus, the word forms from the definitions and examples (glosses) of Princeton WordNet are manually linked to the context-appropriate sense in WordNet. However, the annotation was not complete, and the dataset was never officially released as part of WordNet 3.0, remaining one of the standoff files available for download. Eleven years later, in 2019, one of the authors of this paper restarted the project, aiming to complete the sense annotation of the approximately 200 thousand word forms not yet annotated. Here, we provide additional motivations for completing this dataset and report on the progress of the work and its evaluation. To provide an extra level of consistency in the sense annotation and a deep semantic representation of the definitions and examples, promoting WordNet from a lexical resource to a lightweight ontology, we now employ the English Resource Grammar (ERG), a broad-coverage HPSG grammar of English, to parse the sentences and project the sense annotations from the surface words to the ERG predicates. We also report some initial steps toward upgrading the corpus to WordNet 3.1 to facilitate mapping the data to other lexical resources.
Anthology ID:
2023.gwc-1.30
Volume:
Proceedings of the 12th Global Wordnet Conference
Month:
January
Year:
2023
Address:
University of the Basque Country, Donostia - San Sebastian, Basque Country
Editors:
German Rigau, Francis Bond, Alexandre Rademaker
Venue:
GWC
Publisher:
Global Wordnet Association
Pages:
243–253
URL:
https://aclanthology.org/2023.gwc-1.30
Cite (ACL):
Alexandre Rademaker, Abhishek Basu, and Rajkiran Veluri. 2023. Semantic Parsing and Sense Tagging the Princeton WordNet Gloss Corpus. In Proceedings of the 12th Global Wordnet Conference, pages 243–253, University of the Basque Country, Donostia - San Sebastian, Basque Country. Global Wordnet Association.
Cite (Informal):
Semantic Parsing and Sense Tagging the Princeton WordNet Gloss Corpus (Rademaker et al., GWC 2023)
PDF:
https://aclanthology.org/2023.gwc-1.30.pdf