Digital Signal Processing Handbook P46
Sproat, R. & Olive, J. "Text-to-Speech Synthesis." Digital Signal Processing Handbook. Ed. Vijay K. Madisetti and Douglas B. Williams. Boca Raton: CRC Press LLC, 1999.

© 1999 by CRC Press LLC

46 Text-to-Speech Synthesis

Richard Sproat, Bell Laboratories, Lucent Technologies
Joseph Olive, Bell Laboratories, Lucent Technologies

Introduction
Text Analysis and Linguistic Analysis
Text Preprocessing
Accentuation
Word Pronunciation
Intonational Phrasing
Segmental Durations
Intonation
Speech Synthesis
The Future of TTS
References

Introduction

Text-to-speech synthesis has had a long history, one that can be traced back at least to Dudley's "Voder", developed at Bell Laboratories and demonstrated at the 1939 World's Fair [1]. Practical systems for automatically generating speech parameters from a linguistic representation (such as a phoneme string) were not available until the 1960s, and systems for converting from ordinary text into speech were first completed in the 1970s, with MITalk being the best-known such system [2].
Many projects in text-to-speech conversion have been initiated in the intervening years, and papers on many of these systems have been ...

It is tempting to think of the problem of converting written text into speech as "speech recognition in reverse": current speech recognition systems are generally deemed successful if they can convert speech input into the sequence of words that was uttered by the speaker, so one might imagine that a text-to-speech (TTS) synthesizer would start with the words in the text, convert each word one-by-one into speech (being careful to pronounce each word correctly), and concatenate the result together. However, when one considers what literate native speakers of a language must do when they read a text aloud, it quickly becomes clear that things are much more complicated than this simplistic view suggests. Pronouncing words correctly is only part of the problem faced by human readers: in order to sound natural and to sound as if ...