Yves Peirsman

2013

Semantic similarity is a key issue in many computational tasks. This paper goes into the development and evaluation of two common ways of automatically calculating the semantic similarity between two words. On the one hand, such methods may depend on a manually constructed thesaurus like (Euro)WordNet. Their performance is often evaluated on the basis of a very restricted set of human similarity ratings. On the other hand, corpus-based methods rely on the distribution of two words in a corpus to determine their similarity. Their performance is generally quantified through a comparison with the judgements of the first type of approach. This paper introduces a new Gold Standard of more than 5,000 human intra-category similarity judgements. We show that corpus-based methods often outperform (Euro)WordNet on this data set, and that the use of the latter as a Gold Standard for the former, is thus often far from ideal.

pdf bib abs

Modelling Word Similarity: an Evaluation of Automatic Synonymy Extraction Algorithms.
Kris Heylen | Yves Peirsman | Dirk Geeraerts | Dirk Speelman
Proceedings of the Sixth International Conference on Language Resources and Evaluation (LREC'08)

Vector-based models of lexical semantics retrieve semantically related words automatically from large corpora by exploiting the property that words with a similar meaning tend to occur in similar contexts. Despite their increasing popularity, it is unclear which kind of semantic similarity they actually capture and for which kind of words. In this paper, we use three vector-based models to retrieve semantically related words for a set of Dutch nouns and we analyse whether three linguistic properties of the nouns influence the results. In particular, we compare results from a dependency-based model with those from a 1st and 2nd order bag-of-words model and we examine the effect of the nouns frequency, semantic speficity and semantic class. We find that all three models find more synonyms for high-frequency nouns and those belonging to abstract semantic classses. Semantic specificty does not have a clear influence.

2006

pdf bib abs

Unsupervised approaches to metonymy recognition
Yves Peirsman
Actes de la 13ème conférence sur le Traitement Automatique des Langues Naturelles. REncontres jeunes Chercheurs en Informatique pour le Traitement Automatique des Langues

To this day, the automatic recognition of metonymies has generally been addressed with supervised approaches. However, these require the annotation of a large number of training instances and hence, hinder the development of a wide-scale metonymy recognition system. This paper investigates if this knowledge acquisition bottleneck in metonymy recognition can be resolved by the application of unsupervised learning. Although the investigated technique, Schütze’s (1998) algorithm, enjoys considerable popularity in Word Sense Disambiguation, I will show that it is not yet robust enough to tackle the specific case of metonymy recognition. In particular, I will study the influence on its performance of four variables—the type of data set, the size of the context window, the application of SVD and the type of feature selection.

pdf bib

Example-Based Metonymy Recognition for Proper Nouns
Yves Peirsman
Student Research Workshop

pdf bib

Whats in a Name? The Automatic Recognition of Metonymical Location Names
Yves Peirsman
Proceedings of the Workshop on Making Sense of Sense: Bringing Psycholinguistics and Computational Linguistics Together