tailieunhanh - Chemistry report: "Research Article: Auditory Sparse Representation for Robust Speaker Recognition Based on Tensor Structure"

A collection of international scientific research reports in chemistry, offered as reference material for chemistry enthusiasts.

Hindawi Publishing Corporation
EURASIP Journal on Audio, Speech, and Music Processing
Volume 2008, Article ID 578612, 9 pages
doi 2008 578612

Research Article
Auditory Sparse Representation for Robust Speaker Recognition Based on Tensor Structure

Qiang Wu and Liqing Zhang
Department of Computer Science and Engineering, Shanghai Jiao Tong University, Shanghai 200240, China

Correspondence should be addressed to Liqing Zhang, lqzhang@
Received 31 December 2007; Accepted 29 September 2008
Recommended by Woon-Seng Gan

This paper investigates the problem of speaker recognition in noisy conditions. A new approach called nonnegative tensor principal component analysis (NTPCA) with a sparse constraint is proposed for speech feature extraction. We encode speech as a general higher-order tensor in order to extract discriminative features in the spectrotemporal domain. First, speech signals are represented by cochlear features based on the frequency selectivity characteristics of the basilar membrane and inner hair cells; then low-dimensional sparse features are extracted by NTPCA for robust speaker modeling. The useful information of each subspace in the higher-order tensor can be preserved. An alternating projection algorithm is used to obtain a stable solution. Experimental results demonstrate that our method can increase the recognition accuracy, specifically in noisy environments.

Copyright © 2008 Q. Wu and L. Zhang. This is an open access article distributed under the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.

1. INTRODUCTION

Automatic speaker recognition has developed into an important technology for various speech-based applications. A traditional recognition system usually comprises two processes: feature extraction and speaker modeling. Conventional speaker modeling methods such as Gaussian mixture models (GMMs) [1] achieve very high performance for speaker ...
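The abstract describes learning low-dimensional, sparse, nonnegative features from a higher-order tensor by an alternating procedure. The sketch below is only a rough illustration of that general idea and is not the authors' NTPCA algorithm, whose objective and update rules are not given in this excerpt. It assumes each utterance is already a nonnegative 2-D feature matrix (random data stands in for cochlear features), and it swaps in a simple heuristic: for each tensor mode in turn, take the leading eigenvectors of the mode scatter matrix, then soft-threshold and clip them to be sparse and nonnegative. The function names `unfold`, `mode_product`, and `sparse_nonneg_projections`, and the parameters `ranks` and `sparsity`, are hypothetical.

```python
import numpy as np

def unfold(tensor, mode):
    """Mode-n unfolding: bring axis `mode` to the front, flatten the rest."""
    return np.moveaxis(tensor, mode, 0).reshape(tensor.shape[mode], -1)

def mode_product(tensor, matrix, mode):
    """Multiply `tensor` along axis `mode` by `matrix` (new_dim x old_dim)."""
    moved = np.moveaxis(tensor, mode, -1)   # (..., old_dim)
    return np.moveaxis(moved @ matrix.T, -1, mode)

def sparse_nonneg_projections(samples, ranks, sparsity=0.01, n_iter=10):
    """Heuristic alternating scheme (an assumption, not the paper's method).

    samples: nonnegative array of shape (n_samples, d1, d2).
    Returns one projection matrix per feature mode, shapes (r1, d1), (r2, d2).
    """
    dims = samples.shape[1:]
    rng = np.random.default_rng(0)
    mats = [np.abs(rng.standard_normal((r, d))) for r, d in zip(ranks, dims)]
    for _ in range(n_iter):
        for mode in range(len(dims)):
            # Project along every other mode with the current matrices.
            partial = samples
            for other in range(len(dims)):
                if other != mode:
                    partial = mode_product(partial, mats[other], other + 1)
            # Scatter matrix of the current mode (axis 0 is the sample axis).
            flat = unfold(partial, mode + 1)
            scatter = flat @ flat.T
            _, eigvecs = np.linalg.eigh(scatter)        # ascending eigenvalues
            top = eigvecs[:, ::-1][:, :ranks[mode]].T   # leading vectors as rows
            # Resolve the sign ambiguity so each row has a nonnegative sum.
            top = top * np.where(top.sum(axis=1, keepdims=True) < 0, -1.0, 1.0)
            # Soft-threshold and clip: sparse and nonnegative by construction.
            top = np.maximum(top - sparsity, 0.0)
            norms = np.linalg.norm(top, axis=1, keepdims=True)
            mats[mode] = top / np.maximum(norms, 1e-12)
    return mats

# Usage with synthetic data standing in for cochlear features.
data = np.abs(np.random.default_rng(1).standard_normal((20, 8, 6)))
u1, u2 = sparse_nonneg_projections(data, ranks=(3, 2))
features = mode_product(mode_product(data, u1, 1), u2, 2)  # shape (20, 3, 2)
```

Clipping after an eigendecomposition is a crude projection onto the nonnegative set; it keeps the example short but does not optimize any stated objective, so the result should be read as a shape-and-constraint illustration only.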
