An efficient hardware architecture for HMM-based TTS system


Science & Technology Development, Vol 18, No.T4-2015

Su Hong Kiet, Huynh Huu Thuan, Bui Trong Tu
University of Sciences, VNU-HCM
(Received on December 5th, 2014; accepted on September 23rd, 2015)

ABSTRACT

This work proposes a hardware architecture for an HMM-based text-to-speech synthesis system (HTS). On high-speed platforms, HTS with a software core engine can satisfy the requirement of real-time processing. On low-speed platforms, however, the software core engine takes a long time to complete the synthesis process. A co-processor was designed and integrated into HTS to accelerate the system.

Keywords: text-to-speech synthesis, HMM, HTS, SoPC, FPGA.

INTRODUCTION

An HTS consists of two parts, a training part and a synthesis part, as shown in Fig. 1. In the training part, a context-dependent HMM database is trained from a speech database. The trained context-dependent HMM database consists of models for spectrum, pitch and state duration, together with decision trees for spectrum, pitch and state duration. The synthesis part then uses the trained context-dependent HMM database to generate the speech waveform from the given text.

Fig. 1. Scheme of HTS

In the synthesis part, the given text is analyzed and converted into a label sequence. According to the label sequence, an HMM sentence is constructed by concatenating HMMs taken from the trained HMM database. Excitation and spectral parameters are then extracted from the HMM sentence.
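As a rough illustration of the label-to-parameter step just described, the following Python sketch concatenates per-label HMMs into a sentence HMM and expands it into a frame-level parameter sequence. All names (`State`, `HMM_DATABASE`, `build_sentence_hmm`, `extract_parameters`) and all numeric values are assumptions made for illustration and do not come from the paper. Model selection is reduced to a dictionary lookup, and parameter extraction is reduced to repeating each state's mean vector for its duration, which is a deliberate simplification.

```python
from dataclasses import dataclass
from typing import Dict, List


@dataclass
class State:
    """One HMM state, reduced to a mean parameter vector and a duration in frames."""
    mean: List[float]
    frames: int


# Hypothetical trained database: one short HMM (a list of states) per label.
HMM_DATABASE: Dict[str, List[State]] = {
    "a": [State([1.0, 0.1], 2), State([1.2, 0.2], 1)],
    "b": [State([0.5, 0.9], 1), State([0.4, 0.8], 3)],
}


def build_sentence_hmm(labels: List[str], database: Dict[str, List[State]]) -> List[State]:
    """Concatenate the HMMs selected by each label into one sentence HMM."""
    sentence: List[State] = []
    for label in labels:
        sentence.extend(database[label])  # raises KeyError if a label has no model
    return sentence


def extract_parameters(sentence: List[State]) -> List[List[float]]:
    """Emit one parameter vector per frame by repeating each state's mean."""
    frames: List[List[float]] = []
    for state in sentence:
        frames.extend(list(state.mean) for _ in range(state.frames))
    return frames


labels = ["a", "b", "a"]  # toy label sequence standing in for analyzed text
sentence_hmm = build_sentence_hmm(labels, HMM_DATABASE)
trajectory = extract_parameters(sentence_hmm)
print(len(sentence_hmm), len(trajectory))
```

The sketch keeps the two stages separate so that the concatenation step can be read independently of how parameters are produced from the resulting state sequence.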
The extracted excitation and spectral parameters are fed to a synthesis filter to synthesize the speech waveform. Depending on whether the spectral parameters are represented as mel-cepstral coefficients or mel-generalized cepstral coefficients, the synthesis filter is constructed as an MLSA filter or an MGLSA filter, respectively. In recent research, HTS has been applied to many languages, such as Japanese [1],
