- Authors: 吳宗憲; 陳昭宏
- Author affiliation: Institute of Information Engineering, National Cheng Kung University (國立成功大學資訊工程研究所)
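- Chinese abstract (translated): In this paper, we propose a method that uses a Bayesian network to generate prosodic information. The system adopts 1410 Mandarin syllables as its speech synthesis units. To improve on the traditional rule-based adjustment of prosodic information, the system uses 112 phonetically balanced sentences and 250 selected sentences as its training database. The prosodic information comprises pitch contour, syllable intensity, syllable duration, and the pause between syllables. In addition, we use the pitch-synchronous overlap-and-add algorithm to adjust the pitch contour. In the experiments, we tested 20 listeners; the results show an average intelligibility of 97.0% and a mean opinion score (MOS) of 3.8 for naturalness.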
- English abstract: In this paper, a novel approach based on Bayesian networks to the generation of prosodic information is proposed. A set of 1410 Mandarin syllables is adopted as the basic synthesis units. To enhance the traditional rule-based approach to the generation of prosodic information, the Bayesian network is employed to model the relation between the prosodic information and the linguistic features. This network is trained with a set of 112 phonetically balanced sentences and 250 sentences selected from newspapers and textbooks. Given a Chinese character sequence, the Bayesian network can provide appropriate prosodic information including pitch contour, syllable intensity, syllable duration and pause duration. Furthermore, pitch contour modification is achieved by modifying the waveform output using the pitch-synchronous overlap-and-add (PSOLA) method. The synthesized speech has been tested on 20 subjects. The results indicated that the average correct rate was 97.0% for intelligibility, and that the mean opinion score (MOS) was 3.8 for naturalness.
- Chinese keywords: text-to-speech conversion; prosodic information; pitch contour; Bayesian network
- English keywords: --