Susana Inácio


2008

What’s in a Colour? Studying and Contrasting Colours with COMPARA
Diana Santos | Maria do Rosário Silva | Susana Inácio
Proceedings of the Sixth International Conference on Language Resources and Evaluation (LREC’08)

In this paper we present contrastive colour studies done using COMPARA, the largest edited parallel corpus in the world (as far as we know). The studies were the result of semantic annotation of the corpus in this domain. We chose to start with colour because it is a relatively contained lexical category and the subject of many arguments in linguistics. We begin by explaining the criteria involved in the annotation process, both for the colour categories and for the colour groups created to support finer-grained analyses, and present some quantitative data on these categories and groups. We then compare the corpus's two languages, Portuguese and English, with respect to the diversity of available lexical items and their morphological and syntactic properties, and try to understand how colour is translated. We end by explaining how any user who wants to do serious studies using the corpus can collaborate in enhancing it and make their semantic annotations widely available as well.

2006

Annotating COMPARA, a Grammar-aware Parallel Corpus
Diana Santos | Susana Inácio
Proceedings of the Fifth International Conference on Language Resources and Evaluation (LREC’06)

In this paper we describe the annotation of COMPARA, currently the largest post-edited parallel corpus that includes Portuguese. We describe the motivation, the results so far, and the way the corpus is being annotated. We also provide the first grounded results about syntactic ambiguity in Portuguese. Finally, we discuss some interesting problems that arise in this connection.