Data Mining and Knowledge Discovery Handbook, 2nd Edition, part 34
Knowledge Discovery demonstrates intelligent computing at its best, and is the most desirable and interesting end-product of Information Technology. To be able to discover and to extract knowledge from data is a task that many researchers and practitioners are endeavoring to accomplish. There is a lot of hidden knowledge waiting to be discovered – this is the challenge created by today’s abundance of data. Data Mining and Knowledge Discovery Handbook, 2nd Edition organizes the most current concepts, theories, standards, methodologies, trends, challenges and applications of data mining (DM) and knowledge discovery.

Frank Heppner

The J-measure compares the a priori distribution of X (a binary variable: either the antecedent holds, $X = x$, or not, $X = \bar{x}$) with the a posteriori distribution of X given that $Y = y$. The relative information

$$ j(X \mid Y = y) = \sum_{z \in \{x, \bar{x}\}} P(X = z \mid Y = y) \, \log_2 \frac{P(X = z \mid Y = y)}{P(X = z)} $$

yields the instantaneous information that $Y = y$ provides about X (j is also known as the Kullback-Leibler distance). When applying the rule multiple times, on average we have the information

$$ J(X \mid Y = y) = P(Y = y) \cdot j(X \mid Y = y), $$

which is the J-value of the rule and is bounded by [...] bit. The drawback, however, is that highly infrequent rules do not carry much information on average, due to the factor $P(Y = y)$, so that highly interesting but rarely occurring associations may not appear among the top-ranked rules.

Other measures are conviction (a directed, asymmetric lift) (Brin et al. 1997B), certainty factors from MYCIN (Berzal et al. 2001), correlation coefficients from statistics (Tan and Kumar 2002), and Laplace or Gini from rule induction (Clark and Boswell 1991) or decision tree induction (Breiman 1996). For a comparison of various measures of interestingness, the reader is referred to Hilderman and Hamilton (2001), where general properties that rule measures should have are also discussed.
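The J-measure described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function name `j_measure` and its parameter names are hypothetical, the logarithm is taken to base 2 because the J-value is stated in bits, and the probabilities are assumed to be already estimated from the data.

```python
import math
from typing import Tuple


def j_measure(p_x: float, p_x_given_y: float, p_y: float) -> Tuple[float, float]:
    """Return (j, J) in bits for a binary X and the event Y = y.

    p_x          -- a priori probability P(X = x), must lie strictly in (0, 1)
    p_x_given_y  -- a posteriori probability P(X = x | Y = y)
    p_y          -- probability P(Y = y) that the rule applies
    (All three names are illustrative, not taken from the source.)
    """
    if not 0.0 < p_x < 1.0:
        raise ValueError("p_x must lie strictly between 0 and 1")

    def term(posterior: float, prior: float) -> float:
        # Usual convention: a zero posterior contributes nothing to the sum.
        return 0.0 if posterior == 0.0 else posterior * math.log2(posterior / prior)

    # Sum over the two outcomes z = x and z = not-x.
    j = term(p_x_given_y, p_x) + term(1.0 - p_x_given_y, 1.0 - p_x)
    # Averaging over rule applications weights j by P(Y = y).
    return j, p_y * j


# Same instantaneous information j, but very different J-values:
frequent = j_measure(p_x=0.5, p_x_given_y=1.0, p_y=0.25)
rare = j_measure(p_x=0.5, p_x_given_y=1.0, p_y=0.001)
print(frequent, rare)
```

In the two calls, Y = y removes all uncertainty about X in both cases, yet the rarely applicable rule receives a J-value 250 times smaller. This is the drawback noted in the text: the factor P(Y = y) pushes rare associations down the ranking.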
In Bayardo and Agrawal (1999) it is outlined that, given a fixed consequent, the ordering of rules obtained from confidence is identical to those obtained by lift or conviction, which is further generalized in Bayardo et al. (1999).

Interactive or Knowledge-Based Filtering

Whatever the rule evaluation measure may propose, the final judgment about the interestingness and usefulness of a rule is made by the human expert or user. For instance, many measures consistently return as most interesting those rules that consist of a single item in the consequent, because in this case confidence is maximized (see Section [...]). But the user may be interested in different items or item combinations in the consequent; therefore the subjective interestingness of these rules may be low in [...]
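The observation attributed to Bayardo and Agrawal above, that for a fixed consequent confidence, lift and conviction rank rules identically, can be checked numerically. The sketch below assumes the usual textbook definitions, lift = conf / P(C) and conviction = (1 - P(C)) / (1 - conf), which are not spelled out in this excerpt. The rule names, confidence values and the consequent probability are made up for illustration.

```python
import math


def lift(conf: float, p_c: float) -> float:
    """Lift of a rule with confidence `conf` and consequent probability `p_c`."""
    return conf / p_c


def conviction(conf: float, p_c: float) -> float:
    """Conviction; treated as infinite for rules that never fail (conf = 1)."""
    return math.inf if conf >= 1.0 else (1.0 - p_c) / (1.0 - conf)


def rank(rules, score):
    """Return rule names sorted by descending score of their confidence."""
    ordered = sorted(rules.items(), key=lambda kv: score(kv[1]), reverse=True)
    return [name for name, _ in ordered]


# Hypothetical rules that all share the same consequent C with P(C) = 0.4.
P_CONSEQUENT = 0.4
RULES = {"r1": 0.90, "r2": 0.55, "r3": 0.70, "r4": 0.45}  # name -> confidence

by_conf = rank(RULES, lambda c: c)
by_lift = rank(RULES, lambda c: lift(c, P_CONSEQUENT))
by_conv = rank(RULES, lambda c: conviction(c, P_CONSEQUENT))
print(by_conf, by_lift, by_conv)
```

With P(C) held fixed, both lift and conviction are increasing functions of confidence, so the three rankings cannot differ. Once rules with different consequents are compared, P(C) varies from rule to rule and the rankings may diverge.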