Evaluating Low-Level Speech Features Against Human Perceptual Data

Caitlin Richter; Naomi Feldman; Harini Salgado; Aren Jansen

doi:10.1162/tacl_a_00071

Abstract

We introduce a method for measuring the correspondence between low-level speech features and human perception, using a cognitive model of speech perception implemented directly on speech recordings. We evaluate two speaker normalization techniques using this method and find that in both cases, speech features that are normalized across speakers predict human data better than unnormalized speech features, consistent with previous research. Results further reveal differences across normalization methods in how well each predicts human data. This work provides a new framework for evaluating low-level representations of speech on their match to human perception, and lays the groundwork for creating more ecologically valid models of speech perception.

Anthology ID: Q17-1030
Volume: Transactions of the Association for Computational Linguistics, Volume 5
Year: 2017
Address: Cambridge, MA
Editors: Lillian Lee, Mark Johnson, Kristina Toutanova
Venue: TACL
Publisher: MIT Press
Pages: 425–440
URL: https://aclanthology.org/Q17-1030/
DOI: 10.1162/tacl_a_00071
Cite (ACL): Caitlin Richter, Naomi H. Feldman, Harini Salgado, and Aren Jansen. 2017. Evaluating Low-Level Speech Features Against Human Perceptual Data. Transactions of the Association for Computational Linguistics, 5:425–440.
Cite (Informal): Evaluating Low-Level Speech Features Against Human Perceptual Data (Richter et al., TACL 2017)
PDF: https://aclanthology.org/Q17-1030.pdf
