Constructing a Named Entity Ontology from Web Corpora

Ming-Shun Lin, Hsin-Hsi Chen


Abstract
This paper proposes a named entity (NE) ontology generation engine, called XNE-Tree engine, which produces relational named entities by given a seed. The engine incrementally extracts high co-occurring named entities with the seed by using a common search engine. In each iterative step, the seed will be replaced by its siblings or descendants, which form new seeds. In this way, XNE-Tree engine will build a tree structure with the original seed as a root incrementally. Two seeds, Chinese transliteration names of Nicole Kidman (a famous actress) and Ernest Hemingway (a famous writer), are experimented to evaluate the performance of the XNE-Tree.¡@¡@For test the applicability of the ontology, we employ it to a phoneme-character conversion system, which convert input phoneme syllable sequences to text strings. Total 100 Chinese transliteration names, including 50 person names and 50 location names are used as test data. We derive an ontology composed of 7,642 named entities. The results of phoneme-character conversion show that both the recall rate and the MRR are improved from 0.79 and 0.50 to 0.84 to 0.55, respectively.
Anthology ID:
L06-1061
Volume:
Proceedings of the Fifth International Conference on Language Resources and Evaluation (LREC’06)
Month:
May
Year:
2006
Address:
Genoa, Italy
Editors:
Nicoletta Calzolari, Khalid Choukri, Aldo Gangemi, Bente Maegaard, Joseph Mariani, Jan Odijk, Daniel Tapias
Venue:
LREC
SIG:
Publisher:
European Language Resources Association (ELRA)
Note:
Pages:
Language:
URL:
http://www.lrec-conf.org/proceedings/lrec2006/pdf/118_pdf.pdf
DOI:
Bibkey:
Cite (ACL):
Ming-Shun Lin and Hsin-Hsi Chen. 2006. Constructing a Named Entity Ontology from Web Corpora. In Proceedings of the Fifth International Conference on Language Resources and Evaluation (LREC’06), Genoa, Italy. European Language Resources Association (ELRA).
Cite (Informal):
Constructing a Named Entity Ontology from Web Corpora (Lin & Chen, LREC 2006)
Copy Citation:
PDF:
http://www.lrec-conf.org/proceedings/lrec2006/pdf/118_pdf.pdf