Arnaldo Candido, Jr.

Also published as: Arnaldo Candido, Arnaldo Candido Jr, Arnaldo Candido Jr.


Portal NURC-SP: Design, Development, and Speech Processing Corpora Resources to Support the Public Dissemination of Portuguese Spoken Language
Ana Carolina Rodrigues | Alessandra A. Macedo | Arnaldo Candido Jr | Flaviane R. F. Svartman | Giovana M. Craveiro | Marli Quadros Leite | Sandra M. Aluísio | Vinícius G. Santos | Vinícius M. Garcia
Proceedings of the 16th International Conference on Computational Processing of Portuguese


Deep Learning against COVID-19: Respiratory Insufficiency Detection in Brazilian Portuguese Speech
Edresson Casanova | Lucas Gris | Augusto Camargo | Daniel da Silva | Murilo Gazzola | Ester Sabino | Anna Levin | Arnaldo Candido Jr | Sandra Aluisio | Marcelo Finger
Findings of the Association for Computational Linguistics: ACL-IJCNLP 2021


Rhetorical Move Detection in English Abstracts: Multi-label Sentence Classifiers and their Annotated Corpora
Carmen Dayrell | Arnaldo Candido Jr. | Gabriel Lima | Danilo Machado Jr. | Ann Copestake | Valéria Feltrim | Stella Tagnin | Sandra Aluisio
Proceedings of the Eighth International Conference on Language Resources and Evaluation (LREC'12)

The relevance of automatically identifying rhetorical moves in scientific texts has been widely acknowledged in the literature. This study focuses on abstracts of standard research papers written in English and aims to tackle a fundamental limitation of current machine-learning classifiers: they are mono-labeled, that is, a sentence can be assigned only a single label. However, such an approach does not adequately reflect actual language use, since a move can be realized by a clause, a sentence, or even several sentences. Here, we present MAZEA (Multi-label Argumentative Zoning for English Abstracts), a multi-label classifier which automatically identifies rhetorical moves in abstracts but allows a given sentence to be assigned as many labels as appropriate. We have resorted to various other NLP tools and used two large training corpora: (i) one corpus consists of 645 abstracts from physical sciences and engineering (PE) and (ii) the other corpus is made up of 690 abstracts from life and health sciences (LH). This paper presents our preliminary results, discusses the various challenges involved in multi-label tagging, and works towards satisfactory solutions. In addition, we make our two training corpora publicly available so that they may serve as a benchmark for this new task.


Towards an on-demand Simple Portuguese Wikipedia
Arnaldo Candido Jr | Ann Copestake | Lucia Specia | Sandra Maria Aluísio
Proceedings of the Second Workshop on Speech and Language Processing for Assistive Technologies


SIMPLIFICA: a tool for authoring simplified texts in Brazilian Portuguese guided by readability assessments
Carolina Scarton | Matheus Oliveira | Arnaldo Candido Jr. | Caroline Gasperin | Sandra Aluísio
Proceedings of the NAACL HLT 2010 Demonstration Session


Supporting the Adaptation of Texts for Poor Literacy Readers: a Text Simplification Editor for Brazilian Portuguese
Arnaldo Candido | Erick Maziero | Lucia Specia | Caroline Gasperin | Thiago Pardo | Sandra Aluisio
Proceedings of the Fourth Workshop on Innovative Use of NLP for Building Educational Applications