tailieunhanh - Chemistry report: "Research Article: A Decision-Tree-Based Algorithm for Speech/Music Classification and Segmentation"

A collection of international scientific research reports in chemistry, offered as reference material for chemistry enthusiasts, on the topic: Research Article: A Decision-Tree-Based Algorithm for Speech/Music Classification and Segmentation

Hindawi Publishing Corporation, EURASIP Journal on Audio, Speech, and Music Processing, Volume 2009, Article ID 239892, 14 pages, doi 2009 239892

Research Article
A Decision-Tree-Based Algorithm for Speech/Music Classification and Segmentation

Yizhar Lavner (1) and Dima Ruinskiy (1, 2)
(1) Department of Computer Science, Tel-Hai College, Tel-Hai 12210, Israel
(2) Israeli Development Center, Intel Corporation, Haifa 31015, Israel

Correspondence should be addressed to Yizhar Lavner, yizhar_l@
Received 10 September 2008; Revised 5 January 2009; Accepted 27 February 2009
Recommended by Climent Nadeu

We present an efficient algorithm for segmentation of audio signals into speech or music. The central motivation for our study is consumer audio applications, where various real-time enhancements are often applied. The algorithm consists of a learning phase and a classification phase. In the learning phase, predefined training data is used for computing various time-domain and frequency-domain features, for speech and music signals separately, and for estimating the optimal speech/music thresholds based on the probability density functions of the features. An automatic procedure is employed to select the best features for separation. In the test phase, initial classification is performed for each segment of the audio signal using a three-stage sieve-like approach that applies both Bayesian and rule-based methods. To avoid erroneous rapid alternations in the classification, a smoothing technique is applied, averaging the decision on each segment with past segment decisions. Extensive evaluation of the algorithm on a database of more than 12 hours of speech and more than 22 hours of music showed correct identification rates of and respectively and quick adjustment to alternating speech/music sections. In addition to its accuracy and robustness, the algorithm can be easily adapted to different audio types and is suitable for real-time operation.

Copyright 2009 Y. Lavner and D. Ruinskiy. This
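
Two steps mentioned in the abstract, estimating a speech/music threshold from training data and smoothing per-segment decisions with past decisions, can be illustrated with a small Python sketch. This is not the authors' implementation: the function names, the error-count criterion used in place of the probability density functions, the exponential averaging, and the parameter `alpha` are all assumptions chosen for illustration.

```python
from typing import List, Sequence


def estimate_threshold(speech: Sequence[float], music: Sequence[float]) -> float:
    """Pick the threshold that misclassifies the fewest training samples.

    Assumption (illustration only): speech tends to have HIGHER values of
    this feature than music, so the rule is "value >= t means speech".
    The paper derives thresholds from feature PDFs; this empirical error
    count is a simplified stand-in.
    """
    candidates = sorted(set(speech) | set(music))
    best_t, best_err = candidates[0], float("inf")
    for t in candidates:
        # Speech samples below t and music samples at or above t are errors.
        err = sum(v < t for v in speech) + sum(v >= t for v in music)
        if err < best_err:
            best_t, best_err = t, err
    return best_t


def smooth_decisions(raw: Sequence[int], alpha: float = 0.3) -> List[int]:
    """Average each segment decision with past decisions.

    raw holds one label per segment: 1 = speech, 0 = music.
    Assumption: an exponential moving average with weight `alpha` on the
    current segment; the paper's exact averaging scheme may differ.
    """
    smoothed: List[int] = []
    avg = None
    for d in raw:
        avg = float(d) if avg is None else alpha * d + (1 - alpha) * avg
        smoothed.append(1 if avg >= 0.5 else 0)
    return smoothed


if __name__ == "__main__":
    # Hypothetical feature values, not data from the paper.
    t = estimate_threshold([0.6, 0.7, 0.8], [0.1, 0.2, 0.3])
    print("threshold:", t)
    # A single isolated "music" decision inside speech is suppressed.
    print(smooth_decisions([1, 1, 1, 0, 1, 1]))
    # A sustained change is followed after a short delay.
    print(smooth_decisions([1, 1, 0, 0, 0, 0]))
```

A smaller `alpha` gives more weight to past segments, which suppresses more spurious alternations but delays the response to a genuine speech/music transition. That is the same trade-off the abstract alludes to when it reports quick adjustment to alternating sections.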
