tailieunhanh - Keyword Search in Databases- P23

Keyword Search in Databases- P23:Conceptually, a database can be viewed as a data graph GD(V ,E), where V represents a set of objects, and E represents a set of connections between objects. In this book, we concentrate on two kinds of databases, a relational database (RDB) and an XML database. In an RDB, an object is a tuple that consists of many attribute values where some attribute values are strings or full-text; there is a connection between two objects if there exists at least one reference from one to the other | . ELCA-BASED SEMANTICS 109 Ui Hi y ui 1 um p. c 1 Fi 1 Figure v and childelcacan v Xu and Papakonstantinou 2008 Algorithm 37 isELCA v ch Input anode v and ch child_elcacan v . Output return true ifv is ELCA false otherwise 1 for i 1 to l do 2 X v 3 for j 1 to ch do 4 x rm x Si 5 if x or pre x pre ch j then 6 break 7 else 8 x nextSibling ch j 9 if j ch 1 then 10 return false if v f rm x Si 11 return true After the first step that we got ELCA_CANs if we can find child_elcacan v efficiently for each ELCA_CAN v then we can find ELCAs in time O S1 ld log S . If we assign each ELCA_CAN u to be the child of its ancestor ELCA_CAN node v with the largest Dewey ID then u corresponds to exactly one node in child_elcacan v and the node in child_elcacan v corresponding to u can be found in O d time by the Dewey ID. In the following we use child _elcacan v to denote the set of ELCA_CAN nodes u which is a descendant of v and there does not exist any node x with v x x x w . child_elcacan v u e elca_can Si Si v x u A x e elca_can Si Sl v x x x u There is an one-to-one correspondence between the two definitions of child_elcacan v . It is easy to see that veelca_can S1 - sl e can O elca_can Si Sl O S1 . 110 4. KEYWORD SEARCH IN XML DATABASES Now the problem becomes how to compute child_elcacan v efficiently for all v e elca_can Si Si . Note that the nodes in elca_can Si Si as computed by Uv1SSj elca_can vi are not sorted in Dewey ID order. Similar to DeweyInvertedList a stack based algorithm is used to compute child_elcacan v but it works on the set elca_can Si Si while DeweyInvertedList works on the set Si U S2 U Si . Each stack entry created for a node vi e Si has the following three components elca_can is elca_can vi CH is child_elcacan vi SIB is the list of ELCA_CANs before elca_can which is used to compute CH IndexedStack Xu and Papakonstantinou 2007 2008 is shown in Algorithm 38. For each node vi e Si it computes elca_canvi elca_can vi line 3 a stack entry en is .

TỪ KHÓA LIÊN QUAN