Gisele Montilha Pinheiro
Also published as: Gisele Montilha
2004
The Lácio-Web: Corpora and Tools to Advance Brazilian Portuguese Language Investigations and Computational Linguistic Tools
Sandra Aluisio | Gisele Montilha Pinheiro | Aline M. P. Manfrin | Leandro H. M. de Oliveira | Luiz C. Genoves, Jr. | Stella E. O. Tagnin
Proceedings of the Fourth International Conference on Language Resources and Evaluation (LREC’04)
In this paper we discuss the five requirements for building large publicly available corpora that guided the construction of the Lácio-Web corpora and their environments: 1) a comprehensive text typology; 2) text copyright clearance, compilation and annotation scheme; 3) a friendly and didactic interface; 4) the need to serve as support for several types of research; 5) the need to offer an array of associated tools. We also present the features that make the Lácio-Web corpora interesting and novel, as well as the limitations of this project, such as corpora size and balance, and the non-inclusion of spoken texts in the project’s reference corpus.
2000
An interlingua aiming at communication on the Web: How language-independent can it be?
Ronaldo Teixeira Martins | Lucia Helena Machado Rino | Maria das Gracas Volpe Nunes | Gisele Montilha | Osvaldo Novais de Oliveira
NAACL-ANLP 2000 Workshop: Applied Interlinguas: Practical Applications of Interlingual Approaches to NLP