Speech Trax: A Bottom to the Top Approach for Speaker Tracking and Indexing in an Archiving Context

Félicien Vallet, Jim Uro, Jérémy Andriamakaoly, Hakim Nabi, Mathieu Derval, Jean Carrive


Abstract
With the increasing amount of audiovisual and digital data derived from television and radio sources, professional archives such as INA, France’s national audiovisual institute, acknowledge a growing need for efficient indexing tools. In this paper, we describe the Speech Trax system, which aims to analyze the audio content of TV and radio documents. In particular, we focus on the speaker tracking task, which is very valuable for indexing purposes. First, we detail the overall architecture of the system and show the results obtained in a large-scale experiment, the largest to our knowledge for this type of content (about 1,300 speakers). Then, we present the Speech Trax demonstrator, which gathers the results of various automatic speech processing techniques on top of our speaker tracking system (speaker diarization, speech transcription, etc.). Finally, we provide insight into the performance obtained and suggest directions for future improvements.
Anthology ID:
L16-1318
Volume:
Proceedings of the Tenth International Conference on Language Resources and Evaluation (LREC'16)
Month:
May
Year:
2016
Address:
Portorož, Slovenia
Editors:
Nicoletta Calzolari, Khalid Choukri, Thierry Declerck, Sara Goggi, Marko Grobelnik, Bente Maegaard, Joseph Mariani, Helene Mazo, Asuncion Moreno, Jan Odijk, Stelios Piperidis
Venue:
LREC
Publisher:
European Language Resources Association (ELRA)
Pages:
2011–2016
URL:
https://aclanthology.org/L16-1318
Cite (ACL):
Félicien Vallet, Jim Uro, Jérémy Andriamakaoly, Hakim Nabi, Mathieu Derval, and Jean Carrive. 2016. Speech Trax: A Bottom to the Top Approach for Speaker Tracking and Indexing in an Archiving Context. In Proceedings of the Tenth International Conference on Language Resources and Evaluation (LREC'16), pages 2011–2016, Portorož, Slovenia. European Language Resources Association (ELRA).
Cite (Informal):
Speech Trax: A Bottom to the Top Approach for Speaker Tracking and Indexing in an Archiving Context (Vallet et al., LREC 2016)
PDF:
https://aclanthology.org/L16-1318.pdf