Keyword Search in Databases - P21
Conceptually, a database can be viewed as a data graph G_D(V, E), where V represents a set of objects and E represents a set of connections between objects. In this book, we concentrate on two kinds of databases: a relational database (RDB) and an XML database. In an RDB, an object is a tuple that consists of many attribute values, where some attribute values are strings or full-text; there is a connection between two objects if there exists at least one reference from one to the other.

IDENTIFY MEANINGFUL RETURN INFORMATION

Figure: Sample XML Document [Liu and Chen, 2008b]

Each query result is represented as a pair ⟨t, M⟩: a subtree rooted at t, with nodes M corresponding to the matches that are considered relevant to Q. Every keyword in Q has at least one match in M. Note that one query result should not be subsumed by another; therefore, the root nodes should not have an ancestor-descendant relationship, i.e., t ∈ slca(Q). All the ⟨t, M⟩ pairs can be found efficiently by first finding all SLCAs using the algorithms presented in the previous section, then assigning each match node to the corresponding SLCA node by a linear scan over all SLCAs and the keyword match lists S_i. In the following, we mainly focus on identifying meaningful information based on ⟨t, M⟩.

XSEEK

XSeek [Liu and Chen, 2007; Liu et al., 2009b, 2007] is a system that represents the whole subtree rooted at each SLCA node compactly. We illustrate the general idea of XSeek with the five queries Q1 to Q5 on the sample XML document. For Q1, there is only one keyword, Grizzlies, so it is likely that the user is interested in information about Grizzlies. But by the definition of SLCA, only the node Grizzlies itself is returned, which is not informative. Ideally, the subtree rooted at the team node (0) should be returned, because it conveys that Grizzlies is a team name. Consider Q2 and Q3: many algorithms will return the same subtree for both.
But the user is likely to be interested in different things: for Q2, information about the player whose name is Gasol and who is a forward in the team; for Q3, one particular piece of information, the position of Gasol. To process Q5, XSeek outputs the name of players and provides a link to its player children, which provide information about all the players in the team.

4. KEYWORD SEARCH IN XML DATABASES

Figure: Sample XML schema fragment

  <!ELEMENT team (name, players)>
  <!ELEMENT name (#PCDATA)>
  <!ELEMENT players (player*)>
  <!ELEMENT player (name, nationality, position)>

Queries:

  Q1: Grizzlies
  Q2: Gasol, forward
  Q3: Gasol, position
  Q4: team, Grizzlies, forward
  Q5: ...
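
The step described earlier, assigning each match node to its SLCA root by a linear scan, can be sketched in Python. This is only an illustrative sketch, not the algorithm from the book: it assumes nodes are identified by Dewey IDs (tuples of child positions), that the SLCA list and every keyword match list S_i are already sorted in document order, and that no SLCA is an ancestor of another. The function names and the sample IDs are hypothetical and do not come from the sample XML document.

```python
from typing import Dict, List, Tuple

# Assumption: a node is identified by its Dewey ID, e.g. (0, 1, 0).
Dewey = Tuple[int, ...]


def is_ancestor_or_self(a: Dewey, d: Dewey) -> bool:
    """True if a is d itself or an ancestor of d (prefix test on Dewey IDs)."""
    return d[:len(a)] == a


def assign_matches(slcas: List[Dewey],
                   match_lists: List[List[Dewey]]) -> Dict[Dewey, List[Dewey]]:
    """Pair each SLCA root t with the match nodes M inside its subtree.

    Assumes slcas and every match list are sorted in document order and that
    no SLCA is an ancestor of another, so the subtrees are disjoint ranges.
    """
    result: Dict[Dewey, List[Dewey]] = {t: [] for t in slcas}
    for matches in match_lists:
        i = 0  # index of the current candidate SLCA
        for m in matches:
            # Skip SLCAs whose whole subtree precedes m in document order.
            # Python compares tuples lexicographically, which coincides with
            # document order for Dewey IDs.
            while (i < len(slcas) and slcas[i] < m
                   and not is_ancestor_or_self(slcas[i], m)):
                i += 1
            if i < len(slcas) and is_ancestor_or_self(slcas[i], m):
                result[slcas[i]].append(m)
            # Otherwise m lies outside every SLCA subtree and is dropped.
    return result


# Hypothetical IDs for a query like "Gasol forward": one player subtree
# (0, 1, 0) contains both keywords, a second player only matches "forward".
slcas = [(0, 1, 0)]
s_gasol = [(0, 1, 0, 0)]
s_forward = [(0, 1, 0, 2), (0, 1, 1, 2)]
pairs = assign_matches(slcas, [s_gasol, s_forward])
```

Each match list is scanned once and the SLCA pointer only moves forward, so the cost is linear in the total size of the inputs. Matches are appended per keyword list, so M is grouped by keyword rather than sorted in global document order.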