A New Method for Evaluating Automatically Learned Terminological Taxonomies

Paola Velardi, Roberto Navigli, Stefano Faralli, Juana Maria Ruiz Martinez


Abstract
Evaluating an automatically learned taxonomy against an existing gold standard is a complex problem, because the two can differ in the number, labels, depth and ordering of their nodes. In this paper we propose casting the problem as one of comparing two hierarchical clusterings. To this end we define a variation of the Fowlkes and Mallows measure (Fowlkes and Mallows, 1983). Our method assigns a similarity value B^i_(l,r) to the learned (l) and reference (r) taxonomies for each cut i of the corresponding anonymised hierarchies, starting from the topmost nodes and moving down to the leaf concepts. For each cut i, the two hierarchies can be seen as two clusterings C^i_l, C^i_r of the leaf concepts. We reward early similarity values, i.e. cases in which concepts are clustered in a similar way down to the lowest taxonomy levels (close to the leaf nodes). We apply our method to the evaluation of the taxonomy learning methods put forward by Navigli et al. (2011) and Kozareva and Hovy (2010).
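To make the per-cut comparison concrete, the following is a minimal sketch of the standard Fowlkes and Mallows (1983) index applied to two flat clusterings of the same leaf concepts. It is not the variation defined in the paper, and it does not implement the reward for early similarity values, since the abstract does not specify either. The function names `fowlkes_mallows` and `cut_similarities`, and the encoding of a cut as a list of cluster labels (one per leaf concept), are assumptions made for illustration only.

```python
from collections import Counter
from math import sqrt


def fowlkes_mallows(labels_l, labels_r):
    """Standard Fowlkes-Mallows index for two flat clusterings.

    labels_l[k] and labels_r[k] are the cluster labels assigned to the
    k-th leaf concept by the learned and reference cut, respectively.
    """
    if len(labels_l) != len(labels_r):
        raise ValueError("both clusterings must cover the same leaf concepts")
    n = len(labels_l)
    # Contingency counts: m[i, j] = items in learned cluster i and reference cluster j.
    joint = Counter(zip(labels_l, labels_r))
    rows = Counter(labels_l)  # learned cluster sizes
    cols = Counter(labels_r)  # reference cluster sizes
    t = sum(m * m for m in joint.values()) - n
    p = sum(m * m for m in rows.values()) - n
    q = sum(m * m for m in cols.values()) - n
    if p == 0 or q == 0:
        # One clustering has only singletons; the index is undefined,
        # so this sketch returns 0.0 by convention (an assumption).
        return 0.0
    return t / sqrt(p * q)


def cut_similarities(cuts_l, cuts_r):
    """One similarity value per cut i (hypothetical helper, unweighted)."""
    return [fowlkes_mallows(cl, cr) for cl, cr in zip(cuts_l, cuts_r)]


# Four leaf concepts, two cuts per hierarchy (toy data, not from the paper).
learned = [[0, 0, 1, 1], [0, 0, 0, 1]]
reference = [[0, 0, 1, 1], [0, 0, 1, 1]]
print(cut_similarities(learned, reference))
```

In this toy example the first cut groups the leaves identically in both hierarchies, so its value is 1.0, while the second cut disagrees on one leaf and scores lower. How the per-cut values are combined into a single score is left out on purpose.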
Anthology ID:
L12-1130
Volume:
Proceedings of the Eighth International Conference on Language Resources and Evaluation (LREC'12)
Month:
May
Year:
2012
Address:
Istanbul, Turkey
Editors:
Nicoletta Calzolari, Khalid Choukri, Thierry Declerck, Mehmet Uğur Doğan, Bente Maegaard, Joseph Mariani, Asuncion Moreno, Jan Odijk, Stelios Piperidis
Venue:
LREC
Publisher:
European Language Resources Association (ELRA)
Pages:
1498–1504
URL:
http://www.lrec-conf.org/proceedings/lrec2012/pdf/295_Paper.pdf
Cite (ACL):
Paola Velardi, Roberto Navigli, Stefano Faralli, and Juana Maria Ruiz Martinez. 2012. A New Method for Evaluating Automatically Learned Terminological Taxonomies. In Proceedings of the Eighth International Conference on Language Resources and Evaluation (LREC'12), pages 1498–1504, Istanbul, Turkey. European Language Resources Association (ELRA).
Cite (Informal):
A New Method for Evaluating Automatically Learned Terminological Taxonomies (Velardi et al., LREC 2012)