Entity Mention Detection using a Combination of Redundancy-Driven Classifiers

Silvana Marianela Bernaola Biggio, Manuela Speranza, Roberto Zanoli


Abstract
We present an experimental framework for Entity Mention Detection in which two different classifiers are combined to exploit Data Redundancy attained through the annotation of a large text corpus, as well as a number of Patterns extracted automatically from the same corpus. In order to recognize proper name, nominal, and pronominal mentions we not only exploit the information given by mentions recognized within the corpus being annotated, but also given by mentions occurring in an external and unannotated corpus. The system was first evaluated in the Evalita 2009 evaluation campaign obtaining good results. The current version is being used in a number of applications: on the one hand, it is used in the LiveMemories project, which aims at scaling up content extraction techniques towards very large scale extraction from multimedia sources. On the other hand, it is used to annotate corpora, such as Italian Wikipedia, thus providing easy access to syntactic and semantic annotation for both the Natural Language Processing and Information Retrieval communities. Moreover a web service version of the system is available and the system is going to be integrated into the TextPro suite of NLP tools.
Anthology ID:
L10-1363
Volume:
Proceedings of the Seventh International Conference on Language Resources and Evaluation (LREC'10)
Month:
May
Year:
2010
Address:
Valletta, Malta
Editors:
Nicoletta Calzolari, Khalid Choukri, Bente Maegaard, Joseph Mariani, Jan Odijk, Stelios Piperidis, Mike Rosner, Daniel Tapias
Venue:
LREC
SIG:
Publisher:
European Language Resources Association (ELRA)
Note:
Pages:
Language:
URL:
http://www.lrec-conf.org/proceedings/lrec2010/pdf/530_Paper.pdf
DOI:
Bibkey:
Cite (ACL):
Silvana Marianela Bernaola Biggio, Manuela Speranza, and Roberto Zanoli. 2010. Entity Mention Detection using a Combination of Redundancy-Driven Classifiers. In Proceedings of the Seventh International Conference on Language Resources and Evaluation (LREC'10), Valletta, Malta. European Language Resources Association (ELRA).
Cite (Informal):
Entity Mention Detection using a Combination of Redundancy-Driven Classifiers (Biggio et al., LREC 2010)
Copy Citation:
PDF:
http://www.lrec-conf.org/proceedings/lrec2010/pdf/530_Paper.pdf