Research Article: Comparison of Image Transform-Based Features for Visual Speech Recognition in Clean and Corrupted Videos

Hindawi Publishing Corporation
EURASIP Journal on Image and Video Processing
Volume 2008, Article ID 810362, 9 pages
doi 2008 810362

Rowan Seymour, Darryl Stewart, and Ji Ming
School of Electronics, Electrical Engineering and Computer Science, Queen's University of Belfast, Belfast BT7 1NN, Northern Ireland, UK

Correspondence should be addressed to Darryl Stewart
Received 28 February 2007; Revised 13 September 2007; Accepted 17 December 2007
Recommended by Nikos Nikolaidis

We present results of a study into the performance of a variety of image transform-based feature types for speaker-independent visual speech recognition of isolated digits. This includes the first reported use of features extracted using a discrete curvelet transform. The study compares several methods for selecting features of each feature type and shows the relative benefits of static and dynamic visual features. The features are tested on clean video data and on video data corrupted in a variety of ways, to assess each feature type's robustness to potential real-world conditions. One of the test conditions involves a novel form of video corruption we call jitter, which simulates camera and/or head movement during recording.

Copyright © 2008 Rowan Seymour et al. This is an open access article distributed under the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.

1. INTRODUCTION

Speech is one of the most natural and important means of communication between people. Automatic speech recognition (ASR) can be described as the process of converting an audio speech signal into a sequence of words by computer. This allows people to interact with computers in a way which may be more natural than through ...
