Automatic Period Segmentation of Oral French

Natalia Kalashnikova, Loïc Grobol, Iris Eshkol-Taravella, François Delafontaine


Abstract
Natural language processing of spoken language is still looking for a minimal unit of analysis. In this work, we compare two methods for the automatic segmentation of speech into macro-syntactic periods, which makes it possible to take into account both the syntactic and prosodic components of speech. We compare the performance of an existing tool, Analor (Avanzi, Lacheret-Dujour, and Victorri, 2008), developed for the automatic segmentation of prosodic periods, with that of CRF models relying on syntactic and/or prosodic features. We find that Analor tends to divide speech into smaller segments and that CRF models detect larger segments rather than macro-syntactic periods. Overall, however, the CRF models achieve a better F-measure than Analor.
Anthology ID:
2020.lrec-1.785
Volume:
Proceedings of the Twelfth Language Resources and Evaluation Conference
Month:
May
Year:
2020
Address:
Marseille, France
Venue:
LREC
Publisher:
European Language Resources Association
Pages:
6389–6394
Language:
English
URL:
https://aclanthology.org/2020.lrec-1.785
Cite (ACL):
Natalia Kalashnikova, Loïc Grobol, Iris Eshkol-Taravella, and François Delafontaine. 2020. Automatic Period Segmentation of Oral French. In Proceedings of the Twelfth Language Resources and Evaluation Conference, pages 6389–6394, Marseille, France. European Language Resources Association.
Cite (Informal):
Automatic Period Segmentation of Oral French (Kalashnikova et al., LREC 2020)
PDF:
https://aclanthology.org/2020.lrec-1.785.pdf