Division of Example Sentences Based on the Meaning of a Target Word Using Semi-Supervised Clustering

Hiroyuki Shinnou, Minoru Sasaki


Abstract
In this paper, we describe a system that divides example sentences (the data set) into clusters, based on the meaning of the target word, using a semi-supervised clustering technique. In this task, estimating the number of clusters (that is, the number of meanings of the target word) is critical, and our system concentrates primarily on this aspect. First, a user gives the system an initial cluster number for the target word. The system then performs general clustering on the data set to obtain small clusters. Next, using constraints given by the user, the system integrates these clusters to obtain the final clustering result. Our system performs this entire procedure with high precision while requiring only a few constraints. In the experiment, we tested the system on 12 Japanese nouns used in the SENSEVAL2 Japanese dictionary task. The experiment demonstrated the effectiveness of our system. In the future, we plan to improve the sentence similarity measurement.
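The integration step described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: it assumes the user's constraints are must-link pairs (two sentences said to share a meaning), which the abstract does not specify, and it assumes the small clusters have already been produced by some general clustering step. The function name merge_clusters and the toy labels are hypothetical.

```python
def merge_clusters(labels, must_link):
    """Merge small clusters joined by at least one must-link constraint.

    labels:    small-cluster id for each sentence (assumed output of the
               initial general clustering step).
    must_link: (i, j) sentence-index pairs the user marks as sharing a
               meaning (an assumed constraint type, not from the paper).
    """
    # Union-find over cluster ids: each cluster starts as its own root.
    parent = {c: c for c in set(labels)}

    def find(c):
        # Follow parent links to the root, shortening the path as we go.
        while parent[c] != c:
            parent[c] = parent[parent[c]]
            c = parent[c]
        return c

    for i, j in must_link:
        ri, rj = find(labels[i]), find(labels[j])
        if ri != rj:
            # Keep the smaller id as the representative of the merged cluster.
            parent[max(ri, rj)] = min(ri, rj)

    return [find(c) for c in labels]


# Seven example sentences over-clustered into four small clusters.
small = [0, 0, 1, 1, 2, 2, 3]
# Two user constraints: sentences 1 and 2 match, sentences 4 and 6 match.
constraints = [(1, 2), (4, 6)]
final = merge_clusters(small, constraints)
print(final)            # [0, 0, 0, 0, 2, 2, 2]
print(len(set(final)))  # 2 clusters remain
```

In this toy run, two constraints reduce four small clusters to two, which reflects the idea of estimating the number of meanings by starting from an over-segmented clustering and merging under a small amount of user supervision.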
Anthology ID:
L08-1339
Volume:
Proceedings of the Sixth International Conference on Language Resources and Evaluation (LREC'08)
Month:
May
Year:
2008
Address:
Marrakech, Morocco
Editors:
Nicoletta Calzolari, Khalid Choukri, Bente Maegaard, Joseph Mariani, Jan Odijk, Stelios Piperidis, Daniel Tapias
Venue:
LREC
Publisher:
European Language Resources Association (ELRA)
URL:
http://www.lrec-conf.org/proceedings/lrec2008/pdf/301_paper.pdf
Cite (ACL):
Hiroyuki Shinnou and Minoru Sasaki. 2008. Division of Example Sentences Based on the Meaning of a Target Word Using Semi-Supervised Clustering. In Proceedings of the Sixth International Conference on Language Resources and Evaluation (LREC'08), Marrakech, Morocco. European Language Resources Association (ELRA).
Cite (Informal):
Division of Example Sentences Based on the Meaning of a Target Word Using Semi-Supervised Clustering (Shinnou & Sasaki, LREC 2008)