Multimedia_Data_Mining_08

Chapter 8 (MDM) | Chapter 7 A Multimodal Approach to Image Data Mining and Concept Discovery

Introduction

This chapter gives an example of multimedia data mining by addressing the automatic image annotation problem and its application to multimodal image data mining and retrieval. Specifically, we propose a probabilistic semantic model in which the visual features and the textual words are connected via a hidden layer that constitutes the semantic concepts to be discovered, explicitly exploiting the synergy between the two modalities. The association between the visual features and the textual words is determined in a Bayesian framework, so that the confidence of the association can be provided. Extensive evaluations of the prototype system based on the model are reported on a large-scale, visually and semantically diverse image collection crawled from the Web.

In the proposed probabilistic model, a hidden concept layer that connects the visual feature layer and the word layer is discovered by fitting a generative model to the training images and annotation words. An Expectation-Maximization (EM) based iterative learning procedure is developed to determine the conditional probabilities of the visual features and the textual words given a hidden concept class. Based on the discovered hidden concept layer and the corresponding conditional probabilities, image annotation and text-to-image retrieval are performed using the Bayesian framework. The evaluations of the prototype system on 17,000 images and 7,736 annotation words automatically extracted from the crawled Web pages indicate that the model and the framework are superior to a state-of-the-art peer system in the literature.

The rest of the chapter is organized as follows: Section introduces the motivations for this work and outlines its main contributions.
Section discusses the related work on image annotation and .
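
To make the learning procedure described in the introduction more concrete, the following is a minimal sketch of EM for a hidden concept layer. It is not the chapter's actual model, whose formulas are not given in this excerpt. It assumes, purely for illustration, that each image has been reduced to a single discrete visual token (for example, a cluster index) and that the training data are (visual token, word) pairs. The names `fit_em` and `annotate`, the toy data, and the smoothing constant are all hypothetical.

```python
import random

def fit_em(pairs, n_visual, n_words, n_concepts, n_iters=50, seed=0):
    """Estimate P(z), P(v|z), P(w|z) from (visual_token, word) pairs with EM.

    Assumption: visual features are discrete tokens; this is a simplification
    made for the sketch, not something stated in the chapter excerpt.
    """
    rng = random.Random(seed)

    def random_dist(n):
        raw = [rng.random() + 0.1 for _ in range(n)]
        total = sum(raw)
        return [x / total for x in raw]

    p_z = [1.0 / n_concepts] * n_concepts
    p_v_z = [random_dist(n_visual) for _ in range(n_concepts)]
    p_w_z = [random_dist(n_words) for _ in range(n_concepts)]

    eps = 1e-9  # tiny smoothing so no probability becomes exactly zero
    for _ in range(n_iters):
        z_acc = [eps] * n_concepts
        v_acc = [[eps] * n_visual for _ in range(n_concepts)]
        w_acc = [[eps] * n_words for _ in range(n_concepts)]
        for v, w in pairs:
            # E-step: posterior over hidden concepts for this pair,
            # proportional to P(z) * P(v|z) * P(w|z).
            joint = [p_z[k] * p_v_z[k][v] * p_w_z[k][w]
                     for k in range(n_concepts)]
            total = sum(joint)
            for k in range(n_concepts):
                r = joint[k] / total
                z_acc[k] += r
                v_acc[k][v] += r
                w_acc[k][w] += r
        # M-step: renormalize the expected counts into probabilities.
        z_total = sum(z_acc)
        p_z = [a / z_total for a in z_acc]
        p_v_z = [[a / sum(row) for a in row] for row in v_acc]
        p_w_z = [[a / sum(row) for a in row] for row in w_acc]
    return p_z, p_v_z, p_w_z

def annotate(v, p_z, p_v_z, p_w_z):
    """Return P(w | v) for every word, marginalizing over hidden concepts."""
    post = [p_z[k] * p_v_z[k][v] for k in range(len(p_z))]
    total = sum(post)
    post = [p / total for p in post]  # P(z | v) by Bayes' rule
    n_words = len(p_w_z[0])
    return [sum(post[k] * p_w_z[k][w] for k in range(len(p_z)))
            for w in range(n_words)]

# Toy data (hypothetical): tokens 0-1 co-occur with words 0-1,
# tokens 2-3 co-occur with words 2-3.
pairs = [(0, 0), (0, 1), (1, 0), (1, 1),
         (2, 2), (2, 3), (3, 2), (3, 3)] * 5
p_z, p_v_z, p_w_z = fit_em(pairs, n_visual=4, n_words=4, n_concepts=2)
word_probs = annotate(0, p_z, p_v_z, p_w_z)
print(word_probs)
```

The returned list is a probability distribution over words, so each annotation comes with a confidence value, which mirrors the role the introduction assigns to the Bayesian framework. Text-to-image retrieval could be sketched symmetrically by computing P(v | w) from the same learned tables.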