Scientific report: "The Impact of Query Refinement in the Web People Search Task"

Javier Artiles, UNED NLP IR group, Madrid, Spain, javart@
Julio Gonzalo, UNED NLP IR group, Madrid, Spain, julio@
Enrique Amigo, UNED NLP IR group, Madrid, Spain, enrique@

Abstract

Searching for a person name in a Web search engine usually leads to a number of web pages that refer to several people sharing the same name. In this paper we study whether it is reasonable to assume that pages about the desired person can be filtered by the user by adding query terms. Our results indicate that, although in most occasions there is a query refinement that gives all and only those pages related to an individual, it is unlikely that the user is able to find this expression a priori.

1 Introduction

The Web has now become an essential resource to obtain information about individuals, but at the same time its growth has made web people search (WePS) a challenging task, because every single name is usually shared by many different people. One of the mainstream approaches to solve this problem is designing meta-search engines that cluster search results, producing one cluster per person which contains all documents referring to this person. Up to now, two evaluation campaigns, WePS 1 in 2007 (Artiles et al., 2007) and WePS 2 in 2009 (Artiles et al., 2009), have produced datasets for this clustering task, with over 15 research groups submitting results in each campaign.
Since the release of the first datasets, this task has become an increasingly popular research topic among Information Retrieval and Natural Language Processing researchers. For precision-oriented queries, for instance finding the homepage, the email or the phone number of a given person, clustered results might help locate the desired data faster while avoiding confusion with other people sharing the same name. But the utility of clustering is more obvious for recall-oriented queries, where the goal is to mine the web for information about a .