tailieunhanh - Keyword Search in Databases- P10
Keyword Search in Databases- P10:Conceptually, a database can be viewed as a data graph GD(V ,E), where V represents a set of objects, and E represents a set of connections between objects. In this book, we concentrate on two kinds of databases, a relational database (RDB) and an XML database. In an RDB, an object is a tuple that consists of many attribute values where some attribute values are strings or full-text; there is a connection between two objects if there exists at least one reference from one to the other | 44 2. SCHEMA-BASED KEYWORD SEARCH ON RELATIONAL DATABASES as dis t c . If dis t ki dis t c Dmax the tuple t will be projected from the RDB. Here both dis tt c and dis t ki are in the range of 0 Dmax . In this phase all such tuples t will be projected which are sufficient to compute all multi-center communities because the set of such tuples contain every keyword-tuple center-tuple and path-tuple to compute all communities. This is illustrated in Figure c when l 2. The new DC algorithm to compute communities under distinct core semantics is given in Algorithm 13. Suppose that there are n relations in an RDB for an l-keyword query. The first reduction phase is in lines 1-7. The second third reduction phases are done in a for-loop lines 814 in which the second reduction phase is line 9 and the third reduction phase is in lines 10-14. Lines 15-17 are similar as done in DC-Naive to compute communities using Pairi 1 i l and 5 relation. For the first reduction it computes Gj i for every keyword ki and every relation Rj separately by calling a procedure PairRoot Algorithm 14 . PairRoot is designed in a similar fashion to Pair . The main difference is that PairRoot computes tuples t that are in shortest distance to a virtual node keyword or center within Dmax. Take keyword-nodes as an example. The shortest distance to a tuple containing a keyword is more important than which tuple contains a keyword. Therefore we only maintain the shortest distance line 9 in Algorithm 14 . PairRoot returns a collection of Gj i for a given keyword ki for 1 j n. Note that G J j i Gj i. In lines 3-4 it projects Rj using semijoin Rj i Rj K Gj i .Here Rj i C Rj is a set of tuples that are within Dmax from a virtual keyword-node ki .Note that Y Un 1 Yj Xj C Rj is a set of centers in relation Rj line 7 . In line 9 starting from all center nodes Xi Xn it computes Wj i for keyword ki for 1 j n. Note that Wi Jj 1 Wj i. In lines 10-14 it further projects Rj out of Rj i for a keyword ki for 1 j n.
đang nạp các trang xem trước