Data Mining and Knowledge Discovery Handbook, 2nd Edition, part 27
Knowledge Discovery demonstrates intelligent computing at its best, and is the most desirable and interesting end-product of Information Technology. To be able to discover and to extract knowledge from data is a task that many researchers and practitioners are endeavoring to accomplish. There is a lot of hidden knowledge waiting to be discovered – this is the challenge created by today's abundance of data. Data Mining and Knowledge Discovery Handbook, 2nd Edition organizes the most current concepts, theories, standards, methodologies, trends, challenges and applications of data mining (DM) and knowledge discovery.

Excerpt (Armin Shmilovici, Chapter 12: Support Vector Machines):

A tube with radius ε is fitted to the data, and a regression function that generalizes well is then found by controlling both the regression capacity (via ||w||) and the loss function. One possible realization, called C-SVR, is to minimize the following objective function:

    \frac{1}{2}\|w\|^2 + C \sum_{i=1}^{n} |y_i - f(x_i)|_{\varepsilon}

The regularization constant C > 0 determines the trade-off between the empirical error and the complexity term.

Fig.: In SV regression, a tube with radius ε is fitted to the data. The optimization determines a trade-off between model complexity and points lying outside of the tube. (Figure taken from Smola and Schölkopf, 2004.)

Generalization to kernel-based regression estimation is carried out in complete analogy with the classification problem. Introducing Lagrange multipliers and choosing a priori the regularization constants C and ε, one arrives at a dual quadratic optimization problem. The support vectors and the support values of the solution define the following regression function:

    f(x) = \sum_{i=1}^{n} \alpha_i K(x, x_i) + b

There are degrees of freedom in constructing an SVR, such as how to penalize or regularize different parts of the vector, how to use the kernel trick, and which loss function to use. For example, in the ν-SVR algorithm implemented in LIBSVM (Chang and Lin, 2001), one specifies an upper bound 0 ≤ ν ≤ 1 on the fraction of points allowed to lie outside the tube (asymptotically, the fraction of support vectors). For a priori chosen constants C and ν, the dual quadratic optimization problem is as follows:

    \max_{\alpha, \alpha^*} \; \sum_{i=1}^{n} (\alpha_i^* - \alpha_i)\, y_i - \frac{1}{2} \sum_{i,j=1}^{n} (\alpha_i^* - \alpha_i)(\alpha_j^* - \alpha_j)\, K(x_i, x_j)

    subject to \; \sum_{i=1}^{n} (\alpha_i^* - \alpha_i) = 0, \quad 0 \le \alpha_i, \alpha_i^* \le C/n, \; i = 1, \dots, n, \quad \sum_{i=1}^{n} (\alpha_i^* + \alpha_i) \le C\nu

and the regression solution is expressed as:

    f(x) = \sum_{i=1}^{n} (\alpha_i^* - \alpha_i)\, K(x, x_i) + b

SVM-like Models

The power of SVM comes from the kernel representation, which allows a non-linear mapping of the input space to a higher-dimensional feature space. However, the resulting quadratic programming ...
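To make the C-SVR objective above concrete, here is a minimal sketch (not from the handbook) of the ε-insensitive loss and the regularized objective it enters; the function names and default values are illustrative only.

```python
import numpy as np

def eps_insensitive_loss(y, y_pred, eps=0.1):
    # |y - f(x)|_eps = max(0, |y - f(x)| - eps): residuals inside the
    # epsilon-tube cost nothing, residuals outside grow linearly.
    return np.maximum(0.0, np.abs(y - y_pred) - eps)

def c_svr_objective(w, y, y_pred, C=1.0, eps=0.1):
    # 0.5 * ||w||^2 + C * sum_i |y_i - f(x_i)|_eps
    return 0.5 * np.dot(w, w) + C * np.sum(eps_insensitive_loss(y, y_pred, eps))
```

Here C plays exactly the role described above: it trades off the empirical (tube-exceeding) error against the complexity term ||w||².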
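The kernel-expansion form of the regression function, f(x) = Σᵢ αᵢ K(x, xᵢ) + b, can be evaluated directly once the support vectors and support values are known. The following sketch assumes an RBF kernel; the kernel choice and names are illustrative assumptions, not taken from the handbook.

```python
import numpy as np

def rbf_kernel(x, z, gamma=1.0):
    # K(x, z) = exp(-gamma * ||x - z||^2)
    return np.exp(-gamma * np.sum((np.asarray(x) - np.asarray(z)) ** 2))

def svr_predict(x, support_vectors, support_values, b, gamma=1.0):
    # f(x) = sum_i alpha_i * K(x, x_i) + b, where the sum runs only over
    # the support vectors (all other coefficients are zero).
    return sum(a * rbf_kernel(x, sv, gamma)
               for a, sv in zip(support_values, support_vectors)) + b
```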
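For the ν-SVR formulation, one readily available implementation is scikit-learn's NuSVR, which wraps LIBSVM's ν-SVR solver. The sketch below only illustrates the role of ν; the synthetic data, gamma value, and ν = 0.2 are arbitrary choices for the example, not values from the text.

```python
import numpy as np
from sklearn.svm import NuSVR

rng = np.random.default_rng(0)
X = np.sort(rng.uniform(-3.0, 3.0, size=(200, 1)), axis=0)
y = np.sinc(X).ravel() + 0.1 * rng.standard_normal(200)

# nu upper-bounds the fraction of training points lying outside the tube
# and lower-bounds the fraction of support vectors (asymptotically).
model = NuSVR(nu=0.2, C=1.0, kernel="rbf", gamma=0.5)
model.fit(X, y)

print("fraction of support vectors:", len(model.support_) / len(X))
print("prediction at x=0:", model.predict([[0.0]])[0])
```

Raising ν permits more points outside the tube and typically increases the number of support vectors, consistent with its interpretation as a bound on both fractions.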