Roberto Zanoli


2014

The Excitement Open Platform for Textual Inferences
Bernardo Magnini | Roberto Zanoli | Ido Dagan | Kathrin Eichler | Guenter Neumann | Tae-Gil Noh | Sebastian Pado | Asher Stern | Omer Levy
Proceedings of 52nd Annual Meeting of the Association for Computational Linguistics: System Demonstrations

2012

The KnowledgeStore: an Entity-Based Storage System
Roldano Cattoni | Francesco Corcoglioniti | Christian Girardi | Bernardo Magnini | Luciano Serafini | Roberto Zanoli
Proceedings of the Eighth International Conference on Language Resources and Evaluation (LREC'12)

This paper describes the KnowledgeStore, a large-scale infrastructure for the combined storage and interlinking of multimedia resources and ontological knowledge. Information in the KnowledgeStore is organized around entities, such as persons, organizations and locations. The system allows users (i) to import background knowledge about entities, in the form of annotated RDF triples; (ii) to associate resources with entities by automatically recognizing, coreferring and linking mentions of named entities; and (iii) to derive new entities based on knowledge extracted from mentions. The KnowledgeStore builds on state-of-the-art technologies for language processing, including document tagging, named entity extraction and cross-document coreference. Its design provides for a tight integration of linguistic and semantic features, and eases the further processing of information by explicitly representing the contexts where knowledge and mentions are valid or relevant. We describe the system and report on the creation of a large-scale KnowledgeStore instance for storing and integrating multimedia content and background knowledge relevant to the Italian Trentino region.

2010

Entity Mention Detection using a Combination of Redundancy-Driven Classifiers
Silvana Marianela Bernaola Biggio | Manuela Speranza | Roberto Zanoli
Proceedings of the Seventh International Conference on Language Resources and Evaluation (LREC'10)

We present an experimental framework for Entity Mention Detection in which two different classifiers are combined to exploit Data Redundancy attained through the annotation of a large text corpus, as well as a number of Patterns extracted automatically from the same corpus. In order to recognize proper name, nominal, and pronominal mentions, we exploit the information given not only by mentions recognized within the corpus being annotated, but also by mentions occurring in an external, unannotated corpus. The system was first evaluated in the Evalita 2009 evaluation campaign, where it obtained good results. The current version is being used in a number of applications. On the one hand, it is used in the LiveMemories project, which aims at scaling up content extraction techniques towards very large scale extraction from multimedia sources. On the other hand, it is used to annotate corpora, such as Italian Wikipedia, thus providing easy access to syntactic and semantic annotation for both the Natural Language Processing and Information Retrieval communities. Moreover, a web service version of the system is available, and the system is going to be integrated into the TextPro suite of NLP tools.

BART: A Multilingual Anaphora Resolution System
Samuel Broscheit | Massimo Poesio | Simone Paolo Ponzetto | Kepa Joseba Rodriguez | Lorenza Romano | Olga Uryupina | Yannick Versley | Roberto Zanoli
Proceedings of the 5th International Workshop on Semantic Evaluation

2008

The TextPro Tool Suite
Emanuele Pianta | Christian Girardi | Roberto Zanoli
Proceedings of the Sixth International Conference on Language Resources and Evaluation (LREC'08)

We present TextPro, a suite of modular Natural Language Processing (NLP) tools for the analysis of Italian and English texts. The suite has been designed to integrate and reuse state-of-the-art NLP components developed by researchers at FBK. The current version of the tool suite provides functions ranging from tokenization to chunking and Named Entity Recognition (NER). The system’s architecture is organized as a pipeline of processors in which each stage accepts data from the initial input or from the output of a previous stage, executes a specific task, and sends the resulting data to the next stage or to the output of the pipeline. TextPro performed best on the tasks of Italian NER and Italian PoS tagging at EVALITA 2007. When tested on a number of other standard English benchmarks, TextPro confirms that it performs as a state-of-the-art system. Distributions for Linux, Solaris and Windows are available, for both research and commercial purposes. A web-service version of the system is under development.
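
The pipeline-of-processors architecture described in this abstract can be illustrated with a minimal Python sketch. This is not TextPro's code or API: the function names (`tokenize`, `tag_capitalized`, `run_pipeline`) and the dictionary-based document are hypothetical, and the "entity" stage is a toy stand-in for a real NER component. It only shows how each stage takes the output of the previous one and passes its result along.

```python
from typing import Callable, Dict, List

# Hypothetical document representation: a dict that accumulates annotations.
Document = Dict[str, object]
Processor = Callable[[Document], Document]


def tokenize(doc: Document) -> Document:
    """First stage: split the raw text into whitespace-separated tokens."""
    out = dict(doc)  # copy so earlier stage outputs are not mutated
    out["tokens"] = str(doc["text"]).split()
    return out


def tag_capitalized(doc: Document) -> Document:
    """Toy stand-in for NER: keep tokens that start with an uppercase letter."""
    out = dict(doc)
    out["entities"] = [t for t in doc["tokens"] if t[:1].isupper()]
    return out


def run_pipeline(doc: Document, stages: List[Processor]) -> Document:
    """Feed the initial input through each stage in order."""
    for stage in stages:
        doc = stage(doc)  # output of one stage is the input of the next
    return doc


result = run_pipeline({"text": "Roberto works in Trento"},
                      [tokenize, tag_capitalized])
print(result["tokens"])
print(result["entities"])
```

Because every stage shares the same signature, stages can be added, removed, or reordered without touching `run_pipeline`, which is one plausible reason to favor this kind of modular design.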