SemSim: Resources for Normalized Semantic Similarity Computation Using Lexical Networks

Elias Iosif, Alexandros Potamianos


Abstract
We investigate the creation of corpora from web-harvested data following a scalable approach that has linear query complexity. Individual web queries are posed for a lexicon that includes thousands of nouns and the retrieved data are aggregated. A lexical network is constructed, in which the lexicon nouns are linked according to their context-based similarity. We introduce the notion of semantic neighborhoods, which are exploited for the computation of semantic similarity. Two types of normalization are proposed and evaluated on the semantic tasks of: (i) similarity judgement, and (ii) noun categorization and taxonomy creation. The created corpus along with a set of tools and noun similarities are made publicly available.
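The abstract's pipeline (context-based similarity between lexicon nouns, a lexical network linking them, and semantic neighborhoods read off that network) can be illustrated with a minimal sketch. This is not the authors' implementation: the bag-of-words context window, the cosine measure, the neighborhood size `k`, the function names, and the toy sentences are all assumptions chosen for illustration, and the two normalization schemes evaluated in the paper are not reproduced here.

```python
from collections import Counter
from math import sqrt


def context_vector(word, sentences, window=1):
    """Count the tokens found within `window` positions of `word` (assumed feature scheme)."""
    counts = Counter()
    for sentence in sentences:
        tokens = sentence.lower().split()
        for i, tok in enumerate(tokens):
            if tok == word:
                lo, hi = max(0, i - window), i + window + 1
                # Collect neighbors of position i, skipping the target word itself.
                counts.update(t for j, t in enumerate(tokens[lo:hi], lo) if j != i)
    return counts


def cosine(a, b):
    """Cosine similarity between two sparse count vectors; 0.0 if either is empty."""
    dot = sum(a[key] * b[key] for key in a if key in b)
    norm = sqrt(sum(v * v for v in a.values())) * sqrt(sum(v * v for v in b.values()))
    return dot / norm if norm else 0.0


def build_network(lexicon, sentences, window=1):
    """Link every pair of lexicon words with their context-based similarity."""
    vectors = {w: context_vector(w, sentences, window) for w in lexicon}
    return {
        w: {v: cosine(vectors[w], vectors[v]) for v in lexicon if v != w}
        for w in lexicon
    }


def neighborhood(network, word, k=2):
    """Return the k words most similar to `word` (a hypothetical neighborhood definition)."""
    ranked = sorted(network[word].items(), key=lambda item: item[1], reverse=True)
    return [w for w, _ in ranked[:k]]


# Toy stand-in for aggregated web snippets; not data from the paper.
toy_sentences = [
    "the cat chased the mouse",
    "the dog chased the ball",
    "a car needs fuel",
    "a truck needs fuel",
]
toy_lexicon = ["cat", "dog", "car", "truck"]

network = build_network(toy_lexicon, toy_sentences)
print(neighborhood(network, "cat", k=1))
```

In this toy setting one pass over the snippets per lexicon word is enough to build each context vector, which mirrors the abstract's point that the number of web queries grows linearly with the lexicon size rather than with the number of word pairs.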
Anthology ID:
L12-1247
Volume:
Proceedings of the Eighth International Conference on Language Resources and Evaluation (LREC'12)
Month:
May
Year:
2012
Address:
Istanbul, Turkey
Editors:
Nicoletta Calzolari, Khalid Choukri, Thierry Declerck, Mehmet Uğur Doğan, Bente Maegaard, Joseph Mariani, Asuncion Moreno, Jan Odijk, Stelios Piperidis
Venue:
LREC
Publisher:
European Language Resources Association (ELRA)
Pages:
3499–3504
URL:
http://www.lrec-conf.org/proceedings/lrec2012/pdf/464_Paper.pdf
Cite (ACL):
Elias Iosif and Alexandros Potamianos. 2012. SemSim: Resources for Normalized Semantic Similarity Computation Using Lexical Networks. In Proceedings of the Eighth International Conference on Language Resources and Evaluation (LREC'12), pages 3499–3504, Istanbul, Turkey. European Language Resources Association (ELRA).
Cite (Informal):
SemSim: Resources for Normalized Semantic Similarity Computation Using Lexical Networks (Iosif & Potamianos, LREC 2012)