Raquel Amaro

2026

From Complexity Scores to Readable Texts: iRead4Skills for Adult Literacy in Portuguese
Jorge Baptista | Eugénio Ribeiro | Nuno Mamede | David Antunes | Raquel Amaro
Proceedings of the 17th International Conference on Computational Processing of Portuguese (PROPOR 2026) - Vol. 2

Adult Learning (AL) programmes need short, trustworthy texts that match learners’ reading abilities, but educators rarely have time, tools, or evidence-based guidelines to select and adapt materials consistently. We present a live demo of iRead4Skills for European Portuguese: a web-based system that (i) estimates readability/complexity for AL-oriented levels aligned with CEFR, (ii) highlights where complexity concentrates (lexical, grammatical, semantic), and (iii) supports rewriting by offering actionable, level-aware suggestions and curated lexical resources. The demo emphasises transparency and “trainer-first” workflows: users see *why* a text is complex and *how* to revise it without losing meaning.

2025

We present the iRead4Skills Intelligent Complexity Analyzer, an open-access platform specifically designed to assist educators and content developers in addressing the needs of low-literacy adults by analyzing and diagnosing text complexity. This multilingual system integrates a range of Natural Language Processing (NLP) components to assess input texts along multiple levels of granularity and linguistic dimensions in Portuguese, Spanish, and French. It assigns four tailored difficulty levels using state-of-the-art models, and introduces four diagnostic yardsticks—textual structure, lexicon, syntax, and semantics—offering users actionable feedback on specific dimensions of textual complexity. Each component of the system is supported by experiments comparing alternative models on manually annotated data.

2024

Proceedings of the 16th International Conference on Computational Processing of Portuguese - Vol. 2
Pablo Gamallo | Daniela Claro | António Teixeira | Livy Real | Marcos Garcia | Hugo Gonçalo Oliveira | Raquel Amaro

Proceedings of the 16th International Conference on Computational Processing of Portuguese - Vol. 1
Pablo Gamallo | Daniela Claro | António Teixeira | Livy Real | Marcos Garcia | Hugo Gonçalo Oliveira | Raquel Amaro

2014

Extracting semantic relations from Portuguese corpora using lexical-syntactic patterns
Raquel Amaro
Proceedings of the Ninth International Conference on Language Resources and Evaluation (LREC'14)

The growing investment in automatic extraction procedures, together with the need for extensive resources, makes semi-automatic construction a viable and efficient new strategy for developing language resources, combining accuracy, size, coverage and applicability. These assumptions motivated the work described in this paper, which aims to establish and use lexical-syntactic patterns for extracting semantic relations for Portuguese from corpora, as part of a larger ongoing project for the semi-automatic extension of WordNet.PT. In total, 26 lexical-syntactic patterns were established, covering hypernymy/hyponymy and holonymy/meronymy relations between nominal items, and over 34,000 contexts were manually analyzed to evaluate the productivity of each pattern. The set of patterns and respective examples are given, as well as data concerning the extraction of relations (right hits, wrong hits and related hits) and the total number of occurrences of each pattern in CPRC. Although the results are language-dependent, and thus of clear interest for the development of lexical resources for Portuguese, they are also expected to be helpful as a basis for establishing patterns for related languages such as Spanish, Catalan, French or Italian.

LexTec — a rich language resource for technical domains in Portuguese
Palmira Marrafa | Raquel Amaro | Sara Mendes
Proceedings of the Ninth International Conference on Language Resources and Evaluation (LREC'14)

The growing amount of available information and the importance given to access to technical information enhance the potential role of NLP applications in enabling users to deal with information for a variety of knowledge domains. In this process, language resources are crucial. This paper presents LexTec, a rich computational language resource for technical vocabulary in Portuguese. Encoding a representative set of terms for ten different technical domains, this concept-based relational language resource combines a wide range of linguistic information by integrating each entry in a domain-specific wordnet and associating it with a precise definition for each lexicalization in the technical domain at stake, illustrative texts and information for translation into English.

This paper presents some aspects of the first Portuguese frequency lexicon extracted from a large corpus. The Multifunctional Computational Lexicon of Contemporary Portuguese (henceforth MCL) arose from the need to fill a gap in the study of contemporary Portuguese. Until recently, the frequency lexicons of Portuguese were very small, such as Português Fundamental, which consists of 2,217 words extracted from a 700,000-word corpus, and the Frequency Dictionary of Portuguese Words, based on a literary corpus of 500,000 words. We describe here the main steps taken in collecting the lexical and frequency data and some of the major problems that arose in the process. The resulting lexicon is a freely available, reliable resource for several types of applications.
