Ellen Souza
2026
UlyssesLegalNER-Br: from Legislative to Legal, a comprehensive corpus of Brazilian legal documents for Named Entity Recognition
Hidelberg O. Albuquerque | Ellen Souza | Danilo C. G. Lucena | Héldon J. O. Albuquerque | Nádia F. F. da Silva | Márcio de S. Dias | Rafael O. Nunes | Adriano L. I. Oliveira | André C. P. L. F. de Carvalho
Proceedings of the 17th International Conference on Computational Processing of Portuguese (PROPOR 2026) - Vol. 1
The legal domain presents several challenges for Natural Language Processing (NLP), particularly due to its linguistic complexity and lack of public datasets. Named Entity Recognition (NER), a subarea of NLP, has been successfully used to extract useful knowledge from legal texts, but its widespread use is limited by the lack of legal text corpora. This paper introduces UlyssesLegalNER-Br, a comprehensive corpus of Brazilian legal documents for NER, covering bills, case law, and laws, including the first NER corpus based exclusively on Brazilian laws. This research expands the UlyssesNER-Br corpus, previously focused only on the Brazilian legislative domain. The proposed corpus has 560 public documents annotated using a hybrid approach and organized into 9 categories and 23 fine-grained types. It was experimentally evaluated with the CRF, BiLSTM, and BERTimbau architectures regarding predictive performance, computational cost, and label-level results. The best micro F1, 96.18%, was achieved by BERTimbau on the unified corpus, providing a strong baseline for Brazilian legal NER. At the label level, six categories and seven types achieved an F1-score above 95%, while the lowest scores fell in the 71-82% range.
2024
RoBERTaLexPT: A Legal RoBERTa Model pretrained with deduplication for Portuguese
Eduardo Garcia | Nadia Silva | Felipe Siqueira | Juliana Gomes | Hidelberg O. Albuquerque | Ellen Souza | Eliomar Lima | André de Carvalho
Proceedings of the 16th International Conference on Computational Processing of Portuguese - Vol. 1
UlyssesNERQ: Expanding Queries from Brazilian Portuguese Legislative Documents through Named Entity Recognition
Hidelberg O. Albuquerque | Ellen Souza | Tainan Silva | Rafael P. Gouveia | Flavio Junior | Douglas Vitório | Nádia F. F. da Silva | André C.P.L.F. de Carvalho | Adriano L.I. Oliveira | Francisco Edmundo de Andrade
Proceedings of the 16th International Conference on Computational Processing of Portuguese - Vol. 1
2021
Assessing the Impact of Stemming Algorithms Applied to Brazilian Legislative Documents Retrieval
Ellen Souza | Moriyama-Gyovana | Douglas Vitorio | Andre Carvalho | Nadia Felix | Hidelberg Albuquerque | Adriano Oliveira
Proceedings of the 13th Brazilian Symposium in Information and Human Language Technology
Co-authors
- Hidelberg O. Albuquerque 3
- André C. P. L. F. de Carvalho 2
- Adriano L. I. Oliveira 2
- Douglas Vitório 2
- Nádia F. F. da Silva 2
- Hidelberg Albuquerque 1
- Héldon J. O. Albuquerque 1
- André Carvalho 1
- Márcio de S. Dias 1
- Nadia Felix 1
- Eduardo Garcia 1
- Juliana Gomes 1
- Rafael P. Gouveia 1
- Flavio Junior 1
- Eliomar Lima 1
- Danilo C. G. Lucena 1
- Moriyama-Gyovana 1
- Rafael O. Nunes 1
- Adriano Oliveira 1
- Nádia Silva 1
- Tainan Silva 1
- Felipe Siqueira 1
- Francisco Edmundo de Andrade 1
- André de Carvalho 1