Alexandre Rademaker


2021

The GlobalWordNet Formats: Updates for 2020
John P. McCrae | Michael Wayne Goodman | Francis Bond | Alexandre Rademaker | Ewa Rudnicka | Luis Morgado Da Costa
Proceedings of the 11th Global Wordnet Conference

The Global Wordnet Formats have been introduced to enable wordnets to have a common representation that can be integrated through the Global WordNet Grid. As a result of their adoption, a number of shortcomings of the format were identified, and in this paper we describe the extensions to the formats that address these issues. These include: ordering of senses, dependencies between wordnets, pronunciation, syntactic modelling, relations, sense keys, metadata and RDF support. Furthermore, we provide some perspectives on how these changes help in the integration of wordnets.

A Universal Dependencies Corpora Maintenance Methodology Using Downstream Application
Ran Iwamoto | Hiroshi Kanayama | Alexandre Rademaker | Takuya Ohko
Proceedings of the Third Workshop on Computational Typology and Multilingual NLP

This paper investigates updates of Universal Dependencies (UD) treebanks in 23 languages and their impact on a downstream application. Numerous people are involved in updating UD’s annotation guidelines and treebanks in various languages. However, it is not easy to verify whether the updated resources maintain universality with other language resources. Thus, validity and consistency of multilingual corpora should be tested through application tasks involving syntactic structures with PoS tags, dependency labels, and universal features. We apply the syntactic parsers trained on UD treebanks from multiple versions (2.0 to 2.7) to a clause-level sentiment extractor. We then analyze the relationships between attachment scores of dependency parsers and performance in application tasks. For future UD developments, we show examples of outputs that differ depending on version.

2020

English WordNet 2020: Improving and Extending a WordNet for English using an Open-Source Methodology
John Philip McCrae | Alexandre Rademaker | Ewa Rudnicka | Francis Bond
Proceedings of the LREC 2020 Workshop on Multimodal Wordnets (MMW2020)

WordNet, while one of the most widely used resources for NLP, has not been updated for a long time, and as such a new project, English WordNet, has arisen to continue the development of the model under an open-source paradigm. In this paper, we detail the second release of this resource, entitled “English WordNet 2020”. The work has focused, firstly, on the introduction of new synsets and senses and the development of guidelines for this and, secondly, on the integration of contributions from other projects. We present the changes in this edition, which total over 15,000 relative to the previous release.

Inclusion of Lithological terms (rocks and minerals) in The Open Wordnet for English
Alexandre Tessarollo | Alexandre Rademaker
Proceedings of the LREC 2020 Workshop on Multimodal Wordnets (MMW2020)

We extend the Open WordNet for English (OWN-EN) with rock-related and other lithological terms using the authoritative source of GBA’s Thesaurus. Our aim is to improve WordNet so that it functions better within the Oil & Gas domain, particularly on geoscience texts. We use a three-step approach: a proof-of-concept extension of WordNet, a major extension whose impact we evaluate with positive results, and a full extension encompassing all of GBA’s lithological terms. We also build a mapping to GBA, which in turn links to several other resources: WikiData, British Geological Survey, Inspire, GeoSciML and DBpedia.

2019

Fast developing of a Natural Language Interface for a Portuguese WordNet: Leveraging on Sentence Embeddings
Hugo Gonçalo Oliveira | Alexandre Rademaker
Proceedings of the 10th Global Wordnet Conference

We describe how a natural language interface can be developed for a wordnet with a small set of handcrafted templates, leveraging sentence embeddings. The proposed approach does not use rules for parsing natural language queries, but experiments showed that the embeddings model is tolerant enough to correctly predict relation types that do not match known patterns exactly. It was tested with OpenWordNet-PT, for which this method may provide an alternative interface, with benefits also for the curation process.

English WordNet 2019 – An Open-Source WordNet for English
John P. McCrae | Alexandre Rademaker | Francis Bond | Ewa Rudnicka | Christiane Fellbaum
Proceedings of the 10th Global Wordnet Conference

We describe the release of a new wordnet for English based on the Princeton WordNet, but now developed under an open-source model. In particular, this version of WordNet, which we call English WordNet 2019 and which has been developed by multiple people around the world through GitHub, fixes many errors in previous wordnets for English. We give some details of the changes that have been made in this version and some perspectives on likely future changes as this project continues to evolve.

Portuguese Manners of Speaking
Valeria de Paiva | Alexandre Rademaker
Proceedings of the 10th Global Wordnet Conference

Lexical resources need to be as complete as possible. Very little work seems to have been done on adverbs, the smallest part-of-speech class in Princeton WordNet by number of synsets. Amongst adverbs, manner adverbs ending in ‘-ly’ seem the easiest to work with, as their meaning is almost the same as that of the associated adjective. This phenomenon seems to be parallel in English and Portuguese, where these manner adverbs end in the suffix ‘-mente’. We use this correspondence to improve the coverage of adverbs in the lexical resource OpenWordNet-PT, a wordnet for Portuguese.

Completing the Princeton Annotated Gloss Corpus Project
Alexandre Rademaker | Bruno Cuconato | Alessandra Cid | Alexandre Tessarollo | Henrique Andrade
Proceedings of the 10th Global Wordnet Conference

In the Princeton WordNet Gloss Corpus, the word forms from the definitions (“glosses”) in WordNet’s synsets are manually linked to the context-appropriate sense in the WordNet. The glosses then become a sense-disambiguated corpus annotated against WordNet version 3.0. The result is also called a semantic concordance, which can be seen as both a lexicon (WordNet extension) and an annotated corpus. In this work we motivate and present the initial steps to complete the annotation of all open-class words in this corpus. Finally, we introduce a freely-available annotation interface built as an Emacs extension, and evaluate a preliminary annotation effort.

Proceedings of the Third Workshop on Universal Dependencies (UDW, SyntaxFest 2019)
Alexandre Rademaker | Francis Tyers
Proceedings of the Third Workshop on Universal Dependencies (UDW, SyntaxFest 2019)

2018

Using OpenWordnet-PT for Question Answering on Legal Domain
Pedro Delfino | Bruno Cuconato | Guilherme Paulino-Passos | Gerson Zaverucha | Alexandre Rademaker
Proceedings of the 9th Global Wordnet Conference

In order to practice a legal profession in Brazil, law graduates must pass the OAB national unified bar exam. Given their topic coverage and national reach, the OAB exams provide an excellent benchmark for the performance of legal information systems, as they provide objective metrics and are challenging even for humans: only 20% of candidates are approved. After constructing a new data set on the exams and doing shallow experiments on it, we now employ OpenWordnet-PT to verify whether we can improve previous results by using word senses and relations. We discuss the results, possible future ideas and the additions to OpenWordnet-PT that we made.

Extending Wordnet to Geological Times
Henrique Muniz | Fabricio Chalub | Alexandre Rademaker | Valeria De Paiva
Proceedings of the 9th Global Wordnet Conference

This paper describes work extending Princeton WordNet to the domain of geological texts, associated with the time periods of the geological eras of Earth's history. We intend this extension to serve as an example for any other domain extension that we might want to pursue. To provide this extension, we first produce a textual version of Princeton WordNet. Then we map a fragment of the International Commission on Stratigraphy (ICS) ontologies to WordNet and create the appropriate new synsets. We check the extended ontology on a small corpus of sentences from Gas and Oil technical reports and find that more work needs to be done, as we need new words, new senses and new compounds in our extended WordNet.

Text Mining for History: first steps on building a large dataset
Suemi Higuchi | Cláudia Freitas | Bruno Cuconato | Alexandre Rademaker
Proceedings of the Eleventh International Conference on Language Resources and Evaluation (LREC 2018)

2017

Universal Dependencies for Portuguese
Alexandre Rademaker | Fabricio Chalub | Livy Real | Cláudia Freitas | Eckhard Bick | Valeria de Paiva
Proceedings of the Fourth International Conference on Dependency Linguistics (Depling 2017)

2016

An overview of Portuguese WordNets
Valeria de Paiva | Livy Real | Hugo Gonçalo Oliveira | Alexandre Rademaker | Cláudia Freitas | Alberto Simões
Proceedings of the 8th Global WordNet Conference (GWC)

Semantic relations between words are key to building systems that aim to understand and manipulate language. For English, the “de facto” standard for representing this kind of knowledge is Princeton’s WordNet. Here, we describe the wordnet-like resources currently available for Portuguese: their origins, methods of creation, sizes, and usage restrictions. We start tackling the problem of comparing them, but only in quantitative terms. Finally, we sketch ideas for potential collaboration between some of the projects that produce Portuguese wordnets.

Verifying Integrity Constraints of a RDF-based WordNet
Alexandre Rademaker | Fabricio Chalub
Proceedings of the 8th Global WordNet Conference (GWC)

This paper presents our first attempt at verifying integrity constraints of our openWordnet-PT against the ontology for Wordnets encoding. Our wordnet is distributed in Resource Description Framework (RDF) and we want to guarantee not only its syntactic correctness but also its semantic soundness.

Semantic Links for Portuguese
Fabricio Chalub | Livy Real | Alexandre Rademaker | Valeria de Paiva
Proceedings of the Tenth International Conference on Language Resources and Evaluation (LREC'16)

This paper describes work on incorporating Princeton’s WordNet morphosemantic links into the fabric of the Portuguese OpenWordNet-PT. Morphosemantic links are relations between verbs and derivationally related nouns that are semantically typed (such as tune-tuner, in Portuguese “afinar-afinador”, linked through an “agent” link). Morphosemantic links have been discussed for Princeton’s WordNet for a while, but have not been added to the official database. These links are very useful: they help us to improve our Portuguese WordNet. Thus we discuss the integration of these links into our base and the issues we encountered with the integration.

2015

HAREM and Klue: how to put two tagsets for named entities annotation together
Livy Real | Alexandre Rademaker
Proceedings of the Fifth Named Entity Workshop

Seeing is Correcting: curating lexical resources using social interfaces
Livy Real | Fabricio Chalub | Valeria de Paiva | Claudia Freitas | Alexandre Rademaker
Proceedings of the 4th Workshop on Linked Data in Linguistics: Resources and Applications

Proceedings of the 10th Brazilian Symposium in Information and Human Language Technology
Claudia Freitas | Alexandre Rademaker
Proceedings of the 10th Brazilian Symposium in Information and Human Language Technology

Anotação de corpus com a OpenWordNet-PT: um exercício de desambiguação (Sense annotation with OpenWordNet-PT: an exercise of word sense disambiguation)
Cláudia Freitas | Livy Real | Alexandre Rademaker
Proceedings of the 10th Brazilian Symposium in Information and Human Language Technology

2014

Embedding NomLex-BR nominalizations into OpenWordnet-PT
Alexandre Rademaker | Valeria de Paiva | Gerard de Melo | Livy Maria Real Coelho
Proceedings of the Seventh Global Wordnet Conference

OpenWordNet-PT: A Project Report
Alexandre Rademaker | Valeria de Paiva | Gerard de Melo | Livy Real | Maira Gatti
Proceedings of the Seventh Global Wordnet Conference

NomLex-PT: A Lexicon of Portuguese Nominalizations
Valeria de Paiva | Livy Real | Alexandre Rademaker | Gerard de Melo
Proceedings of the Ninth International Conference on Language Resources and Evaluation (LREC'14)

This paper presents NomLex-PT, a lexical resource describing Portuguese nominalizations. NomLex-PT connects verbs to their nominalizations, thereby enabling NLP systems to observe the potential semantic relationships between the two words when analysing a text. NomLex-PT is freely available and encoded in RDF for easy integration with other resources. Most notably, we have integrated NomLex-PT with OpenWordNet-PT, an open Portuguese Wordnet.

2012

OpenWordNet-PT: An Open Brazilian Wordnet for Reasoning
Valeria de Paiva | Alexandre Rademaker | Gerard de Melo
Proceedings of COLING 2012: Demonstration Papers