Leo Võhandu

Also published as: Leo Vohandu


2016

Many new wordnets in the world are constantly created and most take the original Princeton WordNet (PWN) as their starting point. This arguably central position imposes a responsibility on PWN to ensure that its structure is clean and consistent. To validate PWN hierarchical structures we propose the application of a system of test patterns. In this paper, we report on how to validate the PWN hierarchies using the system of test patterns. In sum, test patterns provide lexicographers with a very powerful tool, which we hope will be adopted by the global wordnet community.
New concepts and semantic relations are constantly added to Estonian Wordnet (EstWN) to increase its size. In addition to this, with the use of test patterns, the validation of EstWN hierarchies is also performed. This parallel work was carried out over the past four years (2011-2014) with 10 different EstWN versions (60-70). This has been a collaboration between the creators of test patterns and the lexicographers currently working on EstWN. This paper describes the usage of test patterns from the points of views of information scientists (the creators of test patterns) as well as the users (lexicographers). Using EstWN as an example, we illustrate how the continuous use of test patterns has led to significant improvement of the semantic hierarchies in EstWN.

2014

This paper introduces a test-pattern named a dense component for checking inconsistencies in the hierarchical structure of a wordnet. Dense component (viewed as substructure) points out the cases of regular polysemy in the context of multiple inheritance. Definition of the regular polysemy is redefined ― instead of lexical units there are used lexical concepts (synsets). All dense components are evaluated by expert lexicographer. Based on this experiment we give an overview of the inconsistencies which the test-pattern helps to detect. Special attention is turned to all different kind of corrections made by lexicographer. Authors of this paper find that the greatest benefit of the use of dense components is helping to detect if the regular polysemy is justified or not. In-depth analysis has been performed for Estonian Wordnet Version 66. Some comparative figures are also given for the Estonian Wordnet (EstWN) Version 67 and Princeton WordNet (PrWN) Version 3.1. Analysing hierarchies only hypernym-relations are used.

2012