tailieunhanh - Chemistry report: "Robust Distant Speech Recognition by Combining Multiple Microphone-Array Processing with Position-Dependent CMN"

A collection of international scientific research reports in chemistry, offered as a reference for readers interested in the field. Topic: Robust Distant Speech Recognition by Combining Multiple Microphone-Array Processing with Position-Dependent CMN

Hindawi Publishing Corporation, EURASIP Journal on Applied Signal Processing, Volume 2006, Article ID 95491, Pages 1-11, DOI ASP 2006 95491

Longbiao Wang, Norihide Kitaoka, and Seiichi Nakagawa
Department of Information and Computer Sciences, Toyohashi University of Technology, Toyohashi-shi 441-8580, Japan

Received 29 December 2005; Revised 20 May 2006; Accepted 11 June 2006

We propose robust distant speech recognition by combining multiple microphone-array processing with position-dependent cepstral mean normalization (CMN). In the recognition stage, the system estimates the speaker position and adopts compensation parameters estimated a priori corresponding to the estimated position. Then the system applies CMN to the speech (i.e., position-dependent CMN) and performs speech recognition for each channel. The features obtained from the multiple channels are integrated with the following two types of processing. The first method is to use the maximum vote or the maximum summation likelihood of recognition results from multiple channels to obtain the final result, which is called multiple-decoder processing. The second method is to calculate the output probability of each input at frame level, and a single decoder using these output probabilities is used to perform speech recognition. This is called single-decoder processing and results in lower computational cost. We combine the delay-and-sum beamforming with multiple-decoder processing or single-decoder processing, which is termed multiple microphone-array processing.
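
The position-dependent CMN step and the two multiple-decoder integration rules described above can be illustrated with a minimal Python sketch. It is not the authors' implementation: the table `POSITION_MEANS`, the nearest-position lookup, and all function names are hypothetical, and the per-channel recognition results are assumed to be already available as word-to-log-likelihood dictionaries.

```python
from collections import Counter

# Hypothetical a priori cepstral means, one per calibrated speaker position.
# Keys are (x, y) positions in metres; values are per-dimension cepstral means.
POSITION_MEANS = {
    (0.0, 1.0): [1.0, -0.5],
    (2.0, 1.0): [0.2, 0.3],
}


def nearest_position(estimated, table=POSITION_MEANS):
    """Return the calibrated position closest to the estimated position.

    Assumption: the compensation parameters are chosen by nearest neighbour.
    """
    return min(
        table,
        key=lambda p: (p[0] - estimated[0]) ** 2 + (p[1] - estimated[1]) ** 2,
    )


def position_dependent_cmn(frames, estimated, table=POSITION_MEANS):
    """Subtract the stored cepstral mean for the nearest position from each frame."""
    mean = table[nearest_position(estimated, table)]
    return [[c - m for c, m in zip(frame, mean)] for frame in frames]


def max_vote(channel_results):
    """Return the word that is the top hypothesis in the most channels.

    channel_results: one dict per channel mapping word -> log-likelihood.
    """
    best_per_channel = [max(r, key=r.get) for r in channel_results]
    return Counter(best_per_channel).most_common(1)[0][0]


def max_summation_likelihood(channel_results):
    """Return the word whose log-likelihood summed over all channels is highest."""
    words = set.intersection(*(set(r) for r in channel_results))
    return max(words, key=lambda w: sum(r[w] for r in channel_results))


if __name__ == "__main__":
    # Made-up scores for three channels and two candidate words.
    results = [
        {"a": -10.0, "b": -10.5},
        {"a": -10.0, "b": -10.5},
        {"a": -20.0, "b": -5.0},
    ]
    print(max_vote(results))                  # word preferred by most channels
    print(max_summation_likelihood(results))  # word with the best summed score
    print(position_dependent_cmn([[1.5, 0.0]], (0.2, 0.9)))
```

In this made-up example the two rules disagree: two of the three channels prefer "a", while the summed log-likelihood favours "b". This is why the choice between them matters when channels differ in reliability.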
We conducted experiments on our proposed method using limited-vocabulary (100 words) distant isolated word recognition in a real environment. The proposed multiple microphone-array processing using multiple decoders with position-dependent CMN achieved an improvement (50% relative error reduction rate) over the delay-and-sum beamforming with ...
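
For readers unfamiliar with the metric, a relative error reduction rate is the fraction of the baseline system's errors that the new system removes. The short sketch below shows the arithmetic. The function name is hypothetical and the error rates are illustrative only; they are not results from the paper.

```python
def relative_error_reduction(baseline_error, new_error):
    """Return the fraction of the baseline error removed by the new system."""
    if baseline_error <= 0:
        raise ValueError("baseline_error must be positive")
    return (baseline_error - new_error) / baseline_error


if __name__ == "__main__":
    # Illustrative numbers only: a 10% error rate reduced to 5%.
    print(relative_error_reduction(10.0, 5.0))
```

Under this definition, halving the error rate gives a 50% relative error reduction regardless of how large the absolute improvement is.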
