2017
Processo de construção de um corpus anotado com Entidades Geológicas visando REN (Building an annotated corpus with geological entities for NER) [In Portuguese]
Daniela Amaral
|
Sandra Collovini
|
Anny Figueira
|
Renata Vieira
|
Marco Gonzalez
Proceedings of the 11th Brazilian Symposium in Information and Human Language Technology
2016
Summ-it++: an Enriched Version of the Summ-it Corpus
Evandro Fonseca
|
André Antonitsch
|
Sandra Collovini
|
Daniela Amaral
|
Renata Vieira
|
Anny Figueira
Proceedings of the Tenth International Conference on Language Resources and Evaluation (LREC'16)
This paper presents Summ-it++, an enriched version of the Summ-it corpus. In this new version, the corpus has received new semantic layers, named entity categories and relations between named entities, adding to the previous coreference annotation. In addition, we change the original Summ-it format to SemEval
2014
Comparative Analysis of Portuguese Named Entities Recognition Tools
Daniela Amaral
|
Evandro Fonseca
|
Lucelene Lopes
|
Renata Vieira
Proceedings of the Ninth International Conference on Language Resources and Evaluation (LREC'14)
This paper describes an experiment comparing four tools for recognizing named entities in Portuguese texts. The experiment was conducted on the HAREM corpora, a gold standard for named entity recognition in Portuguese. The tools tested are based on natural language processing techniques and on machine learning. Specifically, one of the tools is based on Conditional Random Fields, a supervised machine learning model that has been used for named entity recognition in several languages, while the other tools follow more traditional natural language processing approaches. The comparison results indicate advantages for different tools depending on the class of named entity. Despite this balance among the tools, we conclude by pointing out foreseeable advantages for the machine-learning-based tool.