Speech Recognition using Neural Networks

Joe Tebelskis
May 1995
CMU-CS-95-142

School of Computer Science
Carnegie Mellon University
Pittsburgh, Pennsylvania 15213-3890

Submitted in partial fulfillment of the requirements for a degree of Doctor of Philosophy in Computer Science

Thesis Committee:
Alex Waibel, chair
Raj Reddy
Jaime Carbonell
Richard Lippmann, MIT Lincoln Labs

Copyright 1995 Joe Tebelskis

This research was supported during separate phases by ATR Interpreting Telephony Research Laboratories, NEC Corporation, Siemens AG, the National Science Foundation, the Advanced Research Projects Administration, and the Department of Defense under Contract No. MDA904-92-C-5161. The views and conclusions contained in this document are those of the author and should not be interpreted as representing the official policies, either expressed or implied, of ATR, NEC, Siemens, NSF, or the United States Government.

Keywords: Speech recognition, neural networks, hidden Markov models, hybrid systems, acoustic modeling, prediction, classification, probability estimation, discrimination, global optimization.

Abstract

This thesis examines how artificial neural networks can benefit a large vocabulary, speaker independent, continuous speech recognition system. Currently, most speech recognition systems are based on hidden Markov models (HMMs), a statistical framework that supports both acoustic and temporal modeling. Despite their state-of-the-art performance, HMMs make a number of suboptimal modeling assumptions that limit their potential effectiveness. Neural networks avoid many of these assumptions, while they can also learn complex functions, generalize effectively, tolerate noise, and support parallelism. While neural networks can readily be applied to acoustic modeling, it is not yet clear how they can be used for temporal modeling. Therefore, we explore a class of systems called NN-HMM hybrids, in which neural networks perform acoustic modeling, and HMMs perform temporal modeling. We argue that a NN-HMM hybrid
