tailieunhanh - Data Mining and Knowledge Discovery Handbook, 2 Edition part 38
Data Mining and Knowledge Discovery Handbook, 2 Edition part 38. Knowledge Discovery demonstrates intelligent computing at its best, and is the most desirable and interesting end-product of Information Technology. To be able to discover and to extract knowledge from data is a task that many researchers and practitioners are endeavoring to accomplish. There is a lot of hidden knowledge waiting to be discovered – this is the challenge created by today’s abundance of data. Data Mining and Knowledge Discovery Handbook, 2nd Edition organizes the most current concepts, theories, standards, methodologies, trends, challenges and applications of data mining (DM) and knowledge discovery. | 350 Jean-Francois Boulicaut and Baptiste Jeudy describe a mining algorithm but rather a pruning technique for non anti-monotonic and non monotonic constraints. Considering a sub-lattice A of 21 the problem is to decide whether this sub-lattice can be pruned. A sub-lattice is characterized by its maximal element M and its minimal element m . the sub-lattice is the collection of all itemsets S such that m C S C M. To prune this sub-lattice one must prove that none of its elements can satisfy the constraint C. To check this the authors introduce the concept of negative witness a negative witness for C in the sub-lattice A is an itemset W such that C W cX e A C X . Therefore if the constraint is not satisfied by the negative witness then the whole sub-lattice can be pruned. Finding witnesses for anti-monotonic or monotonic constraints is easy m is the witness for all anti-monotonic constraints and M for all monotonic ones. The authors then show how to compute efficiently witnesses for various tough constraints. For instance for AVG S a a witness is the set mU i e M a . The authors also gives an algorithm linear in the size of I to compute a witness for the difficult constraint VAR S a where VAR denotes the variance. Ad-hoc Strategies Apart from generic algorithms many algorithms have been designed to cope with specific classes of constraints. We select only two examples. The FIC algorithm Pei et al. 2001 does a depth-first exploration of the itemset lattice. It is very efficient due to its clever data structure a prefix-tree used to store the database. This algorithm can compute the extended theory for a conjunction Cam A Cm A C where C is convertible anti-monotonic or monotonic. A constraint C is convertible anti-monotonic if there exists an order on the items such that if itemsets are written using this order every prefix of an itemset satisfying C satisfies C. For instance AVG S a is convertible anti-monotonic if the items i are ordered by decreasing .
đang nạp các trang xem trước