Keyword Search in Databases - P24
Conceptually, a database can be viewed as a data graph $G_D(V, E)$, where $V$ represents a set of objects and $E$ represents a set of connections between objects. In this book, we concentrate on two kinds of databases: a relational database (RDB) and an XML database. In an RDB, an object is a tuple that consists of many attribute values, where some attribute values are strings or full-text; there is a connection between two objects if there exists at least one reference from one to the other

CHAPTER 5
Other Topics for Keyword Search on Databases

In this chapter, we discuss several interesting research issues regarding keyword search on databases. We first discuss approaches proposed to select a few RDBs among many to answer a keyword query. We then discuss keyword search in a spatial database. Finally, we introduce a PageRank-based approach for RDBs called ObjectRank, and an approach that projects a database so that it contains only the tuples relating to a keyword query.

KEYWORD SEARCH ACROSS DATABASES

There are two main issues to be considered in keyword search across multiple databases:

1. When the number of databases is large, a proper subset of the databases needs to be selected, namely those most suitable to answer a keyword query. This is the problem of keyword-based selection of the top-k databases, and it is studied in M-KS (Yu et al., 2007) and G-KS (Vu et al., 2008).
2. The keyword query needs to be executed across the databases that are selected. This problem is studied in Kite (Sayyadian et al., 2007).

SELECTION OF DATABASES

In order to rank a set of databases $\mathcal{D} = \{D_1, D_2, \ldots\}$ according to their suitability to answer a certain keyword query $Q$, a score function $score(D, Q)$ is defined for each database $D \in \mathcal{D}$. In the ideal case, if the keyword query is evaluated in each database individually, the best database to answer the query is the one that can generate high-quality results.
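The ideal-case selection described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the per-result scores are invented numbers, the names `score_database` and `top_k_databases` are hypothetical, and the database score is taken to be the sum of the scores of the query's results in that database, which is one natural way to aggregate them.

```python
from typing import Dict, List, Tuple

# Hypothetical input: for each database name, the scores of the results
# (MTJNTs) that the keyword query would produce there. Numbers are invented.
ResultScores = Dict[str, List[float]]


def score_database(result_scores: List[float]) -> float:
    """Aggregate per-result scores into one database score (assumed: a sum)."""
    return sum(result_scores)


def top_k_databases(results_by_db: ResultScores, k: int) -> List[Tuple[str, float]]:
    """Return the k highest-scoring databases, best first."""
    scored = [(name, score_database(scores)) for name, scores in results_by_db.items()]
    # Sort by descending score; break ties by name so the order is deterministic.
    scored.sort(key=lambda pair: (-pair[1], pair[0]))
    return scored[:k]


if __name__ == "__main__":
    example = {"D1": [0.5, 0.25], "D2": [1.0], "D3": []}
    print(top_k_databases(example, 2))
```

Note that this sketch presupposes that the query has already been evaluated on every database, which is exactly the cost that statistics-based selection methods try to avoid.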
Suppose $\mathcal{T} = \{T_1, T_2, \ldots\}$ is the set of results (MTJNTs, see Chapter 2) for query $Q$ over database $D$. The following equation can be used to score database $D$:

$$score(D, Q) = \sum_{T \in \mathcal{T}} score(T, Q)$$

where $score(T, Q)$ can be any scoring function for the MTJNT $T$, as discussed in Chapter 2. In practice, it is inefficient to evaluate $Q$ on every database $D \in \mathcal{D}$. A straightforward way to solve the problem efficiently is to calculate the keyword statistics for each keyword $k \in Q$ on each database $D \in \mathcal{D}$, and to summarize the statistics as a score reflecting the relevance of $Q$ to $D$. There are two