Data Mining and Knowledge Discovery Handbook, 2nd Edition, part 24
Knowledge Discovery demonstrates intelligent computing at its best, and is the most desirable and interesting end-product of Information Technology. To be able to discover and to extract knowledge from data is a task that many researchers and practitioners are endeavoring to accomplish. There is a lot of hidden knowledge waiting to be discovered; this is the challenge created by today's abundance of data. Data Mining and Knowledge Discovery Handbook, 2nd Edition organizes the most current concepts, theories, standards, methodologies, trends, challenges and applications of data mining (DM) and knowledge discovery.

Richard A. Berk

... under the broad rubric of data mining. The coverage is intended to be broad rather than deep. Readers are encouraged to consult the references cited.

Some Definitions

There are almost as many definitions of Data Mining as there are treatises on the subject (Sutton and Barto, 1999; Cristianini and Shawe-Taylor, 2000; Witten and Frank, 2000; Hand et al., 2001; Hastie et al., 2001; Breiman, 2001b; Dasu and Johnson, 2003), and associated with Data Mining are a variety of names: statistical learning, machine learning, reinforcement learning, algorithmic modeling, and others.

By Data Mining I mean to emphasize the following. The broad definition of regression analysis applies. Thus, the goal is to examine $y \mid X$ for a response $y$ and a set of predictors $X$, with the values of $X$ treated as fixed. There is no need to commit to any particular feature of $y \mid X$, but emphasis will nevertheless be placed on the conditional mean, $\bar{y} \mid X$. This is the feature of $y \mid X$ that has to date drawn the most attention. [1]

Within the context of regression analysis, now consider a given data set with $N$ observations, a single predictor $x$, and a single value of $x$, $x_0$.
The fitted value $\hat{y}_0$ at $x_0$ can be written as

$$\hat{y}_0 = \sum_{j=1}^{N} S_{0j}\, y_j,$$

where $S$ is an $N$ by $N$ matrix of weights, the subscript 0 represents the row corresponding to the case whose value of $y$ is to be constructed, and the subscript $j$ represents the column in which the weight is found. That is, the fitted value $\hat{y}_0$ at $x_0$ is a linear combination of all $N$ values of $y$, with the weights determined by $S_{0j}$. If, beyond description, estimation is the goal, one has a linear estimator of $\bar{y} \mid x$. In practice, the weights decline with distance from $x_0$, sometimes abruptly (as in a step function), so that many of the values in $S_0$ are often zero.

In a regression context, $S_0$ is constructed from a function $f(x)$ that replaces $x$ with transformations of $x$. Then we often require that

$$f(x) = \sum_{m=1}^{M} \beta_m h_m(x).$$

[1] In much of what follows I use the framework presented in Hastie et al. (2001).
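
As a rough illustration of the weighted-sum form of the fitted value, the sketch below builds a single row of weights for a running mean over the k nearest neighbours of x0, then forms the fitted value as the sum of weight times response over all N cases. The nearest-neighbour step function, the names `smoother_row` and `fitted_value`, and the toy data are assumptions chosen for illustration; they are not taken from the chapter.

```python
def smoother_row(x, x0, k):
    """Return hypothetical weights S_0j for a k-nearest-neighbour running mean at x0.

    Each of the k observations closest to x0 gets weight 1/k and every other
    observation gets weight 0, so the weights are a step function of distance.
    """
    if not 1 <= k <= len(x):
        raise ValueError("k must be between 1 and len(x)")
    # Order the observation indices by distance from x0 (ties broken by index).
    order = sorted(range(len(x)), key=lambda j: (abs(x[j] - x0), j))
    nearest = set(order[:k])
    return [1.0 / k if j in nearest else 0.0 for j in range(len(x))]


def fitted_value(weights, y):
    """Linear combination of all N responses: sum over j of S_0j * y_j."""
    return sum(w * yj for w, yj in zip(weights, y))


# Toy data (illustrative only): one predictor x and a response y.
x = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0]
y = [2.0, 4.0, 3.0, 5.0, 9.0, 7.0]

s0 = smoother_row(x, x0=3.0, k=3)   # one row of the N by N weight matrix
y0_hat = fitted_value(s0, y)        # fitted value at x0
```

With these toy values, only the three cases nearest to x0 = 3 receive nonzero weight, so the fitted value is the mean of their responses (4, 3 and 5), up to floating-point rounding. A smoother whose weights decay gradually with distance would change only `smoother_row`; the weighted sum stays the same.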