Hugo Meinedo


Revising the annotation of a Broadcast News corpus: a linguistic approach
Vera Cabarrão | Helena Moniz | Fernando Batista | Ricardo Ribeiro | Nuno Mamede | Hugo Meinedo | Isabel Trancoso | Ana Isabel Mata | David Martins de Matos
Proceedings of the Ninth International Conference on Language Resources and Evaluation (LREC'14)

This paper presents a linguistic revision process of a speech corpus of Portuguese broadcast news focusing on metadata annotation for rich transcription, and reports on the impact of the new data on the performance of several modules. The revision process focused mainly on annotating and revising structural metadata events, such as disfluencies and punctuation marks. The revised data is now extensively used and has been extremely important for improving the performance of several modules, especially the punctuation and capitalization modules, but also the speech recognition system and all subsequent modules. The data has also recently been used in disfluency studies across domains.

VOCE Corpus: Ecologically Collected Speech Annotated with Physiological and Psychological Stress Assessments
Ana Aguiar | Mariana Kaiseler | Hugo Meinedo | Pedro Almeida | Mariana Cunha | Jorge Silva
Proceedings of the Ninth International Conference on Language Resources and Evaluation (LREC'14)

Public speaking is a widely requested professional skill, and at the same time an activity that causes one of the most common adult phobias (Miller and Stone, 2009). It is also known that the study of stress under laboratory conditions, as it is most commonly done, may provide only limited ecological validity (Wilhelm and Grossman, 2010). Previously, we introduced an interdisciplinary methodology to enable collecting a large number of recordings under consistent conditions (Aguiar et al., 2013). This paper introduces the VOCE corpus of speech annotated with stress indicators under naturalistic public speaking (PS) settings, and makes it available. The novelty of this corpus is that the recordings are carried out in objectively stressful PS situations, as recommended by Zanstra and Johnston (2011). The current database contains a total of 38 recordings, 13 of which contain full psychologic and physiologic annotation. We show that the collected recordings validate the assumptions of the methodology, namely that participants experience stress during the PS events. We describe the various metrics that can be used for physiologic and psychologic annotation, and we characterise the sample collected so far, providing evidence that demographics do not affect the relevant psychologic or physiologic annotation. The collection activities are ongoing, and we expect to increase the number of complete recordings in the corpus to 30 by June 2014.


Meet EDGAR, a tutoring agent at MONSERRATE
Pedro Fialho | Luísa Coheur | Sérgio Curto | Pedro Cláudio | Ângela Costa | Alberto Abad | Hugo Meinedo | Isabel Trancoso
Proceedings of the 51st Annual Meeting of the Association for Computational Linguistics: System Demonstrations


The COST278 Pan-European Broadcast News Database
An Vandecatseye | Jean-Pierre Martens | Joao Neto | Hugo Meinedo | Carmen Garcia-Mateo | Javier Dieguez | France Mihelic | Janez Zibert | Jan Nouza | Petr David | Matus Pleva | Anton Cizmar | Harris Papageorgiou | Christina Alexandris
Proceedings of the Fourth International Conference on Language Resources and Evaluation (LREC’04)