Comparison of Grapheme-to-Phoneme Conversion Methods on a Myanmar Pronunciation Dictionary

Ye Kyaw Thu, Win Pa Pa, Yoshinori Sagisaka, Naoto Iwahashi


Abstract
Grapheme-to-Phoneme (G2P) conversion is the task of predicting the pronunciation of a word from its written (graphemic) form. It is an essential component of both automatic speech recognition (ASR) and text-to-speech (TTS) systems. In this paper, we evaluate seven G2P conversion approaches on a manually tagged Myanmar phoneme dictionary: Adaptive Regularization of Weight Vectors (AROW) based structured learning (S-AROW), Conditional Random Fields (CRF), joint-sequence models (JSM), phrase-based statistical machine translation (PBSMT), Recurrent Neural Networks (RNN), Support Vector Machine (SVM) based point-wise classification, and Weighted Finite-State Transducers (WFST). The G2P bootstrapping experiments were evaluated both automatically, by phoneme error rate (PER), and manually, by checking voiced/unvoiced, tone, consonant, and vowel errors. The results show that the CRF, PBSMT, and WFST approaches perform best for G2P conversion of the Myanmar language.
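The abstract names phoneme error rate (PER) as the automatic metric but does not spell out how it is computed. The sketch below shows one common formulation, assuming PER is the Levenshtein edit distance between predicted and reference phoneme sequences, summed over all entries and divided by the total number of reference phonemes. The function names and the toy phoneme symbols are hypothetical illustrations, not taken from the paper or its dictionary, and the authors' exact scoring procedure may differ.

```python
from typing import Iterable, Sequence, Tuple


def edit_distance(ref: Sequence[str], hyp: Sequence[str]) -> int:
    """Levenshtein distance between two phoneme sequences
    (substitutions, insertions, and deletions each cost 1)."""
    prev = list(range(len(hyp) + 1))  # distances from an empty reference
    for i, r in enumerate(ref, start=1):
        curr = [i]  # distance from ref[:i] to an empty hypothesis
        for j, h in enumerate(hyp, start=1):
            cost = 0 if r == h else 1
            curr.append(min(prev[j] + 1,          # deletion
                            curr[j - 1] + 1,      # insertion
                            prev[j - 1] + cost))  # match or substitution
        prev = curr
    return prev[-1]


def phoneme_error_rate(
    pairs: Iterable[Tuple[Sequence[str], Sequence[str]]]
) -> float:
    """PER in percent over (reference, hypothesis) pairs.
    Assumed definition: total edits / total reference phonemes * 100."""
    pairs = list(pairs)
    total_edits = sum(edit_distance(ref, hyp) for ref, hyp in pairs)
    total_ref = sum(len(ref) for ref, _ in pairs)
    if total_ref == 0:
        raise ValueError("reference set contains no phonemes")
    return 100.0 * total_edits / total_ref


# Toy, made-up phoneme symbols (not entries from the Myanmar dictionary).
toy_pairs = [
    (["k", "a", "t"], ["k", "a", "d"]),  # one substitution
    (["m", "a"], ["m", "a"]),            # exact match
]
print(phoneme_error_rate(toy_pairs))
```

Under this definition, the toy set has one edit against five reference phonemes. A corpus-level ratio like this weights long words more heavily than a per-word average would, which is a design choice rather than something the abstract specifies.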
Anthology ID:
W16-3702
Volume:
Proceedings of the 6th Workshop on South and Southeast Asian Natural Language Processing (WSSANLP2016)
Month:
December
Year:
2016
Address:
Osaka, Japan
Venues:
WS | WSSANLP
Publisher:
The COLING 2016 Organizing Committee
Pages:
11–22
URL:
https://aclanthology.org/W16-3702
Cite (ACL):
Ye Kyaw Thu, Win Pa Pa, Yoshinori Sagisaka, and Naoto Iwahashi. 2016. Comparison of Grapheme-to-Phoneme Conversion Methods on a Myanmar Pronunciation Dictionary. In Proceedings of the 6th Workshop on South and Southeast Asian Natural Language Processing (WSSANLP2016), pages 11–22, Osaka, Japan. The COLING 2016 Organizing Committee.
Cite (Informal):
Comparison of Grapheme-to-Phoneme Conversion Methods on a Myanmar Pronunciation Dictionary (Kyaw Thu et al., 2016)
PDF:
https://aclanthology.org/W16-3702.pdf