tailieunhanh - Handbook of Multimedia for Digital Entertainment and Arts- P13

The advances in computer entertainment, multi-player and online games, technology-enabled art, culture and performance have created a new form of entertainment and art, which attracts and absorbs its participants. The fantastic success of this new field has influenced the development of the new digital entertainment industry and related products and services, which has impacted every aspect of our lives.

354 K. Brandenburg et al.

Zero Crossing Rate

The Zero Crossing Rate (ZCR) simply counts the number of sign changes in audio frames. Since the number of crossings depends on the size of the examined window, the final value has to be normalized by dividing by the actual window size. One of the first evaluations of the zero crossing rate in the area of speech recognition was described by Licklider and Pollack in 1948 [63]. They described the feature extraction process and concluded that the ZCR is useful for digital speech signal processing because it is loudness invariant and speaker independent. Among the variety of publications using the ZCR for MIR are the fundamental genre identification paper by Tzanetakis et al. [110] and a paper dedicated to the classification of percussive sounds by Gouyon [39].

Audio Spectrum Centroid

The Audio Spectrum Centroid (ASC) is another MPEG-7 standardized low-level feature in MIR [88]. As described in [53], it is the center of gravity of the spectrum. It is used to describe the timbre of an audio signal. The feature extraction process is similar to the Audio Spectrum Envelope (ASE) extraction. The difference between ASC and ASE is that the values within the edges of the logarithmically spaced frequency bands are not accumulated; instead, the spectrum centroid is estimated. This spectrum centroid indicates the center of gravity inside the frequency bands.

Audio Spectrum Spread

Audio Spectrum Spread (ASS) is another feature described in the MPEG-7 standard. It is a descriptor of the shape of the power spectrum that indicates whether it is concentrated in the vicinity of its centroid or spread out over the spectrum. The difference between ASE and ASS is that the values within the edges of the logarithmically spaced frequency bands are not accumulated; instead, the spectrum spread is estimated as described in [53]. The spectrum spread allows a good differentiation between tone-like and noise-like sounds.

Mid-level Audio Features

Mid-level .
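Returning to the three low-level features described above (ZCR, ASC and ASS), the following is a minimal sketch of how they might be computed for a single audio frame. It is an illustration under stated assumptions, not the MPEG-7 reference extraction: the centroid and spread here are computed on a linear frequency axis in Hz, whereas the text describes logarithmically spaced frequency bands. The function names, the naive DFT, and the choice to treat a sample of exactly zero as non-negative are all assumptions made for this example.

```python
import cmath
import math


def zero_crossing_rate(frame):
    """Count sign changes between consecutive samples, divided by the frame length."""
    if len(frame) < 2:
        return 0.0
    # Assumption: a sample of exactly 0 is treated as non-negative.
    signs = [1 if x >= 0 else -1 for x in frame]
    crossings = sum(1 for a, b in zip(signs, signs[1:]) if a != b)
    return crossings / len(frame)


def power_spectrum(frame):
    """Naive O(n^2) DFT power spectrum for bins 0..n//2 (adequate for short frames)."""
    n = len(frame)
    spectrum = []
    for k in range(n // 2 + 1):
        coeff = sum(
            x * cmath.exp(-2j * math.pi * k * t / n) for t, x in enumerate(frame)
        )
        spectrum.append(abs(coeff) ** 2)
    return spectrum


def centroid_and_spread(frame, sample_rate):
    """Return (centroid, spread) in Hz on a linear frequency axis.

    centroid: power-weighted mean frequency (center of gravity of the spectrum).
    spread:   power-weighted RMS deviation of frequency around the centroid.
    """
    power = power_spectrum(frame)
    total = sum(power)
    if total == 0:
        return 0.0, 0.0  # silent frame: no meaningful centroid
    n = len(frame)
    freqs = [k * sample_rate / n for k in range(len(power))]
    centroid = sum(f * p for f, p in zip(freqs, power)) / total
    variance = sum((f - centroid) ** 2 * p for f, p in zip(freqs, power)) / total
    return centroid, math.sqrt(variance)
```

As a usage example, a pure tone and a single-sample impulse (which has a flat, broadband spectrum) can be passed to `centroid_and_spread`: the tone should yield a spread close to zero, while the impulse should yield a large spread, which matches the statement that the spread separates tone-like from noise-like sounds.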
