Michel Vacher

2020

Corpus Generation for Voice Command in Smart Home and the Effect of Speech Synthesis on End-to-End SLU
Thierry Desot | François Portet | Michel Vacher
Proceedings of the Twelfth Language Resources and Evaluation Conference

Massive amounts of annotated data have greatly contributed to the advance of the machine learning field. However, such large data sets are often unavailable for novel tasks performed in realistic environments such as smart homes. In this domain, large semantically annotated voice command corpora for Spoken Language Understanding (SLU) are scarce, especially for non-English languages. We present the automatic generation process of a synthetic, semantically annotated corpus of French smart-home commands to train pipeline and End-to-End (E2E) SLU models. SLU is typically performed through Automatic Speech Recognition (ASR) and Natural Language Understanding (NLU) in a pipeline. Since errors at the ASR stage reduce NLU performance, an alternative approach is E2E SLU, which performs ASR and NLU jointly. To that end, the artificial corpus was fed to a text-to-speech (TTS) system to generate synthetic speech data. All models were evaluated on voice commands acquired in a real smart home. We show that artificial data can be combined with real data within the same training set or used as a stand-alone training corpus. The synthetic speech quality was assessed by comparing it to real data using dynamic time warping (DTW).

2016

Acquisition et reconnaissance automatique d’expressions et d’appels vocaux dans un habitat. (Acquisition and recognition of expressions and vocal calls in a smart home)
Michel Vacher | Benjamin Lecouteux | Frédéric Aman | François Portet | Solange Rossato
Actes de la conférence conjointe JEP-TALN-RECITAL 2016. volume 1 : JEP

This article presents a system able to recognize calls for help from elderly people living at home in order to provide them with assistance. The system uses Automatic Speech Recognition (ASR) technology that must operate under distant-speech conditions and with expressive speech. To preserve privacy, the system runs locally and recognizes only predefined sentences. The system was evaluated by 17 participants acting out scenarios that included falls in a Living lab reproducing a living room. The detection error rate obtained, 29%, is encouraging and highlights the challenges still to be overcome for this task.

Ambient Assisted Living aims at enhancing the quality of life of older and disabled people at home thanks to Smart Homes. In particular, for elderly people living alone at home, detecting distress situations after a fall is very important to reassure this population. However, many studies do not include tests in real settings, because data collection in this domain is very expensive and challenging and because few data sets are available. The CIRDO corpus is a dataset recorded in realistic conditions in DOMUS, a fully equipped Smart Home with microphones and home automation sensors, in which participants performed scenarios including real falls on a carpet and calls for help. These scenarios were elaborated thanks to a field study involving elderly persons. Experiments are presented on, first, real-time distress detection using audio and speech analysis and, second, fall detection using video analysis. Results show the difficulty of the task. The database can be used by researchers as a standardized database to evaluate and compare their systems for assisting elderly people.

CirdoX: an on/off-line multisource speech and sound analysis software
Frédéric Aman | Michel Vacher | François Portet | William Duclot | Benjamin Lecouteux
Proceedings of the Tenth International Conference on Language Resources and Evaluation (LREC'16)

Vocal User Interfaces in domestic environments have recently gained interest in the speech processing community. This interest is due to the opportunity of using them in the framework of Ambient Assisted Living, both for home automation (vocal command) and for calls for help in distress situations, i.e. after a fall. CirdoX, which is a modular software, is able to analyse the audio environment of a home online, to extract the uttered sentences, and then to process them with an ASR module. Moreover, this system performs non-speech audio event classification; in this case, specific models must be trained. The software is designed to be modular and to process the multichannel audio stream online. Some examples of studies in which CirdoX was involved are described. They were conducted in a real environment, namely a Living lab.

2015

Recognition of Distress Calls in Distant Speech Setting: a Preliminary Experiment in a Smart Home
Michel Vacher | Benjamin Lecouteux | Frédéric Aman | Solange Rossato | François Portet
Proceedings of SLPAT 2015: 6th Workshop on Speech and Language Processing for Assistive Technologies

2014

The Sweet-Home speech and multimodal corpus for home automation interaction
Michel Vacher | Benjamin Lecouteux | Pedro Chahuara | François Portet | Brigitte Meillon | Nicolas Bonnefond
Proceedings of the Ninth International Conference on Language Resources and Evaluation (LREC'14)

Ambient Assisted Living aims at enhancing the quality of life of older and disabled people at home thanks to Smart Homes and Home Automation. However, many studies do not include tests in real settings, because data collection in this domain is very expensive and challenging and because few data sets are available. The Sweet-Home multimodal corpus is a dataset recorded in realistic conditions in DOMUS, a fully equipped Smart Home with microphones and home automation sensors, in which participants performed Activities of Daily Living (ADL). This corpus is made of a multimodal subset, a French home automation speech subset recorded in Distant Speech conditions, and two interaction subsets, the first one being recorded by 16 persons without disabilities and the second one by 6 seniors and 5 visually impaired people. This corpus was used in studies related to ADL recognition, context-aware interaction, and distant speech recognition applied to home automation controlled through voice.

2013

Proceedings of the Fourth Workshop on Speech and Language Processing for Assistive Technologies
Jan Alexandersson | Peter Ljunglöf | Kathleen F. McCoy | François Portet | Brian Roark | Frank Rudzicz | Michel Vacher
Proceedings of the Fourth Workshop on Speech and Language Processing for Assistive Technologies

Analyzing the Performance of Automatic Speech Recognition for Ageing Voice: Does it Correlate with Dependency Level?
Frédéric Aman | Michel Vacher | Solange Rossato | François Portet
Proceedings of the Fourth Workshop on Speech and Language Processing for Assistive Technologies

Experimental Evaluation of Speech Recognition Technologies for Voice-based Home Automation Control in a Smart Home
Michel Vacher | Benjamin Lecouteux | Dan Istrate | Thierry Joubert | François Portet | Mohamed Sehili | Pedro Chahuara
Proceedings of the Fourth Workshop on Speech and Language Processing for Assistive Technologies

2012

Reconnaissance automatique de la parole distante dans un habitat intelligent : méthodes multi-sources en conditions réalistes (Distant Speech Recognition in a Smart Home : Comparison of Several Multisource ASRs in Realistic Conditions) [in French]
Benjamin Lecouteux | Michel Vacher | François Portet
Proceedings of the Joint Conference JEP-TALN-RECITAL 2012, volume 1: JEP

Etude de la performance des modèles acoustiques pour des voix de personnes âgées en vue de l’adaptation des systèmes de RAP (Assessment of the acoustic models performance in the ageing voice case for ASR system adaptation) [in French]
Frédéric Aman | Michel Vacher | Solange Rossato | Remus Dugheanu | François Portet | Juline le Grand | Yuko Sasa
Proceedings of the Joint Conference JEP-TALN-RECITAL 2012, volume 1: JEP

JEP-TALN-RECITAL 2012, Workshop ILADI 2012: Interactions Langagières pour personnes Agées Dans les habitats Intelligents (ILADI 2012: Language Interaction for Elderly in Smart Homes)
François Portet | Michel Vacher | Gilles Sérasset
JEP-TALN-RECITAL 2012, Workshop ILADI 2012: Interactions Langagières pour personnes Agées Dans les habitats Intelligents (ILADI 2012: Language Interaction for Elderly in Smart Homes)

Les technologies de la parole et du TALN pour l’assistance à domicile des personnes âgées : un rapide tour d’horizon (Quick tour of NLP and speech technologies for ambient assisted living) [in French]
François Portet | Michel Vacher | Solange Rossato
JEP-TALN-RECITAL 2012, Workshop ILADI 2012: Interactions Langagières pour personnes Agées Dans les habitats Intelligents (ILADI 2012: Language Interaction for Elderly in Smart Homes)

Reconnaissance d’ordres domotiques en conditions bruitées pour l’assistance à domicile (Recognition of Voice Commands by Multisource ASR and Noise Cancellation in a Smart Home Environment) [in French]
Benjamin Lecouteux | Michel Vacher | François Portet
JEP-TALN-RECITAL 2012, Workshop ILADI 2012: Interactions Langagières pour personnes Agées Dans les habitats Intelligents (ILADI 2012: Language Interaction for Elderly in Smart Homes)

Contribution à l’étude de la variabilité de la voix des personnes âgées en reconnaissance automatique de la parole (Contribution to the study of elderly people’s voice variability in automatic speech recognition) [in French]
Frédéric Aman | Michel Vacher | Solange Rossato | François Portet
JEP-TALN-RECITAL 2012, Workshop ILADI 2012: Interactions Langagières pour personnes Agées Dans les habitats Intelligents (ILADI 2012: Language Interaction for Elderly in Smart Homes)