Félicien Vallet

2016

Speech Trax: A Bottom to the Top Approach for Speaker Tracking and Indexing in an Archiving Context
Félicien Vallet | Jim Uro | Jérémy Andriamakaoly | Hakim Nabi | Mathieu Derval | Jean Carrive
Proceedings of the Tenth International Conference on Language Resources and Evaluation (LREC'16)

With the increasing amount of audiovisual and digital data deriving from televisual and radiophonic sources, professional archives such as INA, France’s national audiovisual institute, acknowledge a growing need for efficient indexing tools. In this paper, we describe the Speech Trax system, which aims at analyzing the audio content of TV and radio documents. In particular, we focus on the speaker tracking task, which is very valuable for indexing purposes. First, we detail the overall architecture of the system and show the results obtained in a large-scale experiment, the largest to our knowledge for this type of content (about 1,300 speakers). Then, we present the Speech Trax demonstrator, which gathers the results of various automatic speech processing techniques on top of our speaker tracking system (speaker diarization, speech transcription, etc.). Finally, we provide insight into the performance obtained and suggest directions for future improvements.

2014


An Effortless Way To Create Large-Scale Datasets For Famous Speakers
François Salmon | Félicien Vallet
Proceedings of the Ninth International Conference on Language Resources and Evaluation (LREC'14)

The creation of large-scale multimedia datasets has become a scientific matter in itself. Indeed, the fully manual annotation of hundreds or thousands of hours of video and/or audio turns out to be practically infeasible. In this paper, we propose an extremely handy approach to automatically construct a database of famous speakers from TV broadcast news material. We then run a user experiment with a correctly designed tool that demonstrates that very reliable results can be obtained with this method. In particular, a thorough error analysis demonstrates the value of the approach and provides hints for the improvement of the quality of the dataset.

