Scientific report: "Discriminating image senses by clustering with multimodal features"

Nicolas Loeff, Dept. of Computer Science, University of Illinois UC, loeff@
Cecilia Ovesdotter Alm, Dept. of Linguistics, University of Illinois UC, ebbaalm@
David A. Forsyth, Dept. of Computer Science, University of Illinois UC, daf@

Abstract

We discuss Image Sense Discrimination (ISD) and apply a method based on spectral clustering, using multimodal features from the image and the text of the embedding web page. We evaluate our method on a new data set of annotated web images, retrieved with ambiguous query terms. Experiments investigate different levels of sense granularity, as well as the impact of text and image features, and of global versus local text features.

1 Introduction and problem clarification

Semantics extends beyond words. We focus on image sense discrimination (ISD)¹ for web images retrieved from ambiguous keywords, given a multimodal feature set that includes text from the document in which the image was embedded. For instance, a search for CRANE retrieves images of crane machines, crane birds, associated other machinery or animals, etc., people, as well as images of irrelevant meanings. Current displays for image queries (e.g. Google or Yahoo) simply list retrieved images in any order. One application is a user display in which images are presented in semantically sensible clusters for improved image browsing. Another use of the presented model is the automatic creation of sense-discriminated image data sets and the automatic determination of available image senses.
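As a rough illustration of the general idea named in the abstract (spectral clustering over combined text and image features), the following minimal sketch splits six toy items into two groups. Everything in it is an assumption made for illustration: the toy word counts and colour histograms, the helper names `l2_normalize` and `spectral_bipartition`, the Gaussian affinity with width `sigma`, and the simple two-way split on the second eigenvector of the normalized graph Laplacian. It is not the authors' feature set or clustering procedure, which this section does not specify.

```python
import numpy as np


def l2_normalize(rows):
    """Scale each row to unit Euclidean length so the modalities are comparable."""
    rows = np.asarray(rows, dtype=float)
    return rows / np.linalg.norm(rows, axis=1, keepdims=True)


def spectral_bipartition(features, sigma=0.5):
    """Two-way spectral partition: returns a 0/1 label per row of `features`."""
    X = np.asarray(features, dtype=float)
    # Gaussian affinity between every pair of items (assumed similarity measure).
    sq_dists = ((X[:, None, :] - X[None, :, :]) ** 2).sum(axis=2)
    W = np.exp(-sq_dists / (2.0 * sigma ** 2))
    np.fill_diagonal(W, 0.0)
    # Symmetric normalized Laplacian: I - D^{-1/2} W D^{-1/2}.
    d_inv_sqrt = 1.0 / np.sqrt(W.sum(axis=1))
    L_sym = np.eye(len(X)) - d_inv_sqrt[:, None] * W * d_inv_sqrt[None, :]
    # eigh returns eigenvalues in ascending order; column 1 is the second-smallest.
    _, eigvecs = np.linalg.eigh(L_sym)
    fiedler = eigvecs[:, 1]
    # The sign of each entry decides the side of the cut (which side is 0 or 1 is arbitrary).
    return (fiedler > 0).astype(int)


# Hypothetical toy data: rows 0-2 lean towards one sense, rows 3-5 towards another.
text_counts = [[2, 0], [1, 0], [2, 1], [0, 2], [0, 1], [1, 2]]  # counts of two made-up words
image_hist = [[0.9, 0.1], [0.8, 0.2], [0.7, 0.3],
              [0.1, 0.9], [0.2, 0.8], [0.3, 0.7]]               # made-up two-bin colour histograms

# Multimodal feature vector: normalized text features next to normalized image features.
multimodal = np.hstack([l2_normalize(text_counts), l2_normalize(image_hist)])
labels = spectral_bipartition(multimodal)
print(labels)
```

On this toy input the first three items land in one cluster and the last three in the other. For more than two senses, a common variant keeps several eigenvectors and runs k-means on them. That choice is likewise an assumption here, not something stated in the text.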
¹ Cf. Schütze (1998) for a definition of sense discrimination in NLP.

ISD differs from word sense discrimination and disambiguation (WSD) by increased complexity in several respects. As an initial complication, both word and iconographic sense distinctions matter. Whereas a search term like CRANE can refer to, e.g., a MACHINE or a BIRD, iconographic distinctions could additionally include birds standing vs. in a marsh land or flying . .