Tone Question of Tree Based Context Clustering for Hidden Markov Model Based Thai Speech Synthesis
Abstract
Problem statement: In HMM-based Thai speech synthesis, tone is an important factor in the intelligibility of the synthesized speech. Tone distortion resulting from imbalanced training data should be treated appropriately. Approach: This study describes an HMM-based speech synthesis system for the Thai language. In the system, spectrum, pitch and state duration are modeled simultaneously in a unified HMM framework, and their parameter distributions are clustered independently using a decision-tree based context clustering technique. The contextual factors that affect spectrum, pitch and duration, i.e., part of speech, position and number of phones in a syllable, position and number of syllables in a word, position and number of words in a sentence, phone type and tone type, are taken into account in constructing the questions of the decision tree. Since Thai is a tonal language, tone questions play an important role in the context clustering process. Results: Experiments compared the F0 contours of speech synthesized with and without tone questions; in addition, the size of the Thai speech corpus was varied to investigate its effect on synthesized speech quality. Conclusion: Using tone questions in the tree-based context clustering process reduces tone distortion significantly.
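To illustrate how a tone question can enter tree-based context clustering, the following is a minimal sketch rather than the system described in the abstract. It assumes a toy set of syllable contexts with made-up log-F0 values, hypothetical question names (`tone_is_high`, `tone_is_low`, `phone_is_vowel`), and a simple squared-error reduction as the split criterion, which stands in for whatever likelihood-based criterion the actual system uses.

```python
# Toy contexts: tone label, phone type, and a made-up mean log-F0 value.
# All values and names here are illustrative assumptions, not corpus data.
SAMPLES = [
    {"tone": "mid", "phone_type": "vowel", "f0": 5.0},
    {"tone": "mid", "phone_type": "consonant", "f0": 5.1},
    {"tone": "high", "phone_type": "vowel", "f0": 5.6},
    {"tone": "high", "phone_type": "consonant", "f0": 5.5},
    {"tone": "low", "phone_type": "vowel", "f0": 4.6},
    {"tone": "low", "phone_type": "consonant", "f0": 4.7},
]

# Each question is a yes/no predicate over a context (hypothetical names).
QUESTIONS = {
    "tone_is_high": lambda c: c["tone"] == "high",
    "tone_is_low": lambda c: c["tone"] == "low",
    "phone_is_vowel": lambda c: c["phone_type"] == "vowel",
}


def sse(samples):
    """Sum of squared deviations of F0 from the cluster mean."""
    if not samples:
        return 0.0
    values = [s["f0"] for s in samples]
    mean = sum(values) / len(values)
    return sum((v - mean) ** 2 for v in values)


def best_question(samples, questions):
    """Return (name, gain) of the question whose split most reduces SSE."""
    parent = sse(samples)
    best_name, best_gain = None, 0.0
    for name, predicate in questions.items():
        yes = [s for s in samples if predicate(s)]
        no = [s for s in samples if not predicate(s)]
        if not yes or not no:
            continue  # a question that does not split the node is useless
        gain = parent - sse(yes) - sse(no)
        if gain > best_gain:
            best_name, best_gain = name, gain
    return best_name, best_gain


# Compare the root split with and without tone questions available.
with_tone = best_question(SAMPLES, QUESTIONS)
no_tone_questions = {k: v for k, v in QUESTIONS.items() if not k.startswith("tone_")}
without_tone = best_question(SAMPLES, no_tone_questions)
print("with tone questions:   ", with_tone)
print("without tone questions:", without_tone)
```

In this toy setting, removing the tone questions leaves only a split that barely separates the F0 values, so contexts with different tones end up sharing one pitch distribution. That is the kind of tone distortion the abstract attributes to clustering without tone questions.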
DOI: https://doi.org/10.3844/jcssp.2010.1474.1478
Copyright: © 2010 Suphattharachai Chomphan. This is an open access article distributed under the terms of the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original author and source are credited.
Keywords
- Thai tone
- tree-based context clustering
- HMM-based speech synthesis