Text-dependent Forensic Voice Comparison: Likelihood Ratio Estimation with the Hidden Markov Model (HMM) and Gaussian Mixture Model

Satoru Tsuge, Shunichi Ishihara


Abstract
Among the more typical forensic voice comparison (FVC) approaches, the acoustic-phonetic statistical approach is suitable for text-dependent FVC, but its modelling does not fully exploit the time-varying information available in speech. The automatic approach, on the other hand, essentially deals with text-independent cases, which means that temporal information is not explicitly incorporated in the modelling. Text-dependent likelihood ratio (LR)-based FVC studies, in particular those that adopt the automatic approach, are few. This preliminary LR-based FVC study compares two statistical models, the Hidden Markov Model (HMM) and the Gaussian Mixture Model (GMM), for the calculation of forensic LRs using the same speech data. FVC experiments were carried out using Japanese short words of different lengths under a forensically realistic but challenging condition: only two speech tokens were available for model training and LR estimation. The log-likelihood-ratio cost (Cllr) was used as the assessment metric. The study demonstrates that the HMM system consistently outperforms the GMM system in terms of average Cllr values. However, words longer than three morae are needed for the advantage of the HMM to become evident. With a seven-mora word, for example, the Cllr of the HMM system was 0.073 lower (better) than that of the GMM system.
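For readers unfamiliar with the assessment metric, the following is a minimal sketch of Cllr as it is commonly defined in the LR literature: half the sum of the mean penalty log2(1 + 1/LR) over same-speaker comparisons and the mean penalty log2(1 + LR) over different-speaker comparisons. The function name `cllr` and its argument names are hypothetical and chosen for illustration; the abstract does not describe the authors' own implementation, so this should not be read as their code.

```python
import math


def cllr(same_speaker_lrs, different_speaker_lrs):
    """Log-likelihood-ratio cost under its commonly used definition.

    Both arguments are sequences of likelihood ratios (not log LRs),
    each strictly positive. Lower values indicate a better system;
    a system that always outputs LR = 1 scores exactly 1.0.
    """
    if not same_speaker_lrs or not different_speaker_lrs:
        raise ValueError("both comparison sets must be non-empty")

    # Same-speaker comparisons are penalised when the LR is small.
    ss_penalty = sum(math.log2(1.0 + 1.0 / lr) for lr in same_speaker_lrs)
    ss_penalty /= len(same_speaker_lrs)

    # Different-speaker comparisons are penalised when the LR is large.
    ds_penalty = sum(math.log2(1.0 + lr) for lr in different_speaker_lrs)
    ds_penalty /= len(different_speaker_lrs)

    return 0.5 * (ss_penalty + ds_penalty)
```

Under this definition, a system that always returns LR = 1 (no information) yields a Cllr of 1, values below 1 indicate that the LRs carry useful information, and values above 1 indicate misleading LRs.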
Anthology ID:
U18-1002
Volume:
Proceedings of the Australasian Language Technology Association Workshop 2018
Month:
December
Year:
2018
Address:
Dunedin, New Zealand
Venue:
ALTA
Pages:
17–25
URL:
https://aclanthology.org/U18-1002
Cite (ACL):
Satoru Tsuge and Shunichi Ishihara. 2018. Text-dependent Forensic Voice Comparison: Likelihood Ratio Estimation with the Hidden Markov Model (HMM) and Gaussian Mixture Model. In Proceedings of the Australasian Language Technology Association Workshop 2018, pages 17–25, Dunedin, New Zealand.
Cite (Informal):
Text-dependent Forensic Voice Comparison: Likelihood Ratio Estimation with the Hidden Markov Model (HMM) and Gaussian Mixture Model (Tsuge & Ishihara, ALTA 2018)
PDF:
https://aclanthology.org/U18-1002.pdf