Data Mining and Knowledge Discovery Handbook, 2nd Edition, part 68
Knowledge Discovery demonstrates intelligent computing at its best, and is the most desirable and interesting end-product of Information Technology. Discovering and extracting knowledge from data is a task that many researchers and practitioners are endeavoring to accomplish. A great deal of hidden knowledge is waiting to be discovered; this is the challenge created by today's abundance of data. The Data Mining and Knowledge Discovery Handbook, 2nd Edition organizes the most current concepts, theories, standards, methodologies, trends, challenges and applications of data mining (DM) and knowledge discovery.

Chapter 32, Data Mining Model Comparison (Paolo Giudici), pp. 650-651

Table. Calculations for the threshold chart. Accuracy values shown as "…" are missing from the source.

Cut-off (%)   Acc. A   Freq. A   Acc. B   Freq. B   Acc. C   Freq. C
95            0        1         0        1         0        1
90            0        1         0        1         0        1
85            0        1         0        1         0        1
80            0        1         0        1         0        1
75            0        1         0        1         0        1
70            0        1         0        1         0        1
65            0        1         0        1         0        1
60            0        1         0        1         0        2
55            0        2         0        1         0        2
50            …        6         0        1         0        2
45            …        7         0        2         0        2
40            …        9         0        4         0        2
35            …        18        0        8         0        2
30            …        28        …        26        0        8
25            …        41        …        49        0        18
20            …        57        …        51        …        37
15            …        102       …        101       …        56
10            …        180       …        154       …        148
5             …        396       …        418       …        325

Fig. Threshold charts of the models.

… of which 5% (83) are bad and 95% (1556) are good. Looking at model A and considering a cut-off level of 5%, notice that the model classifies 396 enterprises as bad. This figure is clearly higher than the actual number of bad enterprises (83), and consequently the accuracy rate of the model will be low. Indeed, of the 396 enterprises estimated as bad, only 45 are actually bad, which gives an accuracy rate of 45/396, about 11.4%, for the model. Model A reaches its maximum accuracy at cut-offs of 40% and 50%. Similar conclusions can be drawn for the other two models. To summarize, from the Response Threshold Chart we can state that, for the examined dataset:

- For low levels of the cut-off (up to 15%), the highest accuracy rates are those of Reg-3 (model C).
- For higher levels of the cut-off (between 20% and 55%), model A shows a greater accuracy in predicting the occurrence of default (bad) situations.

In light of these considerations, it seems natural to ask which of the three is actually the best model. This question does not have a unique answer: the solution depends on the cut-off level judged most appropriate for the business problem at hand. In our case, since default is a rare event, a low cut-off is …
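
The threshold-chart calculation described above can be illustrated with a minimal Python sketch. It assumes that "accuracy" at a given cut-off means the share of truly bad cases among those the model classifies as bad (the reading suggested by the 45-out-of-396 example), and that a case is classified as bad when its score is at or above the cut-off. The function name `threshold_chart` and the toy data are hypothetical and are not taken from the handbook.

```python
def threshold_chart(y_true, scores, cutoffs):
    """Build (cutoff, freq, accuracy) rows for a response threshold chart.

    y_true  : 1 for a bad (default) case, 0 for a good one
    scores  : predicted probability of being bad, between 0 and 1
    cutoffs : probability thresholds to evaluate

    Assumption: a case is classified as bad when score >= cutoff.
    """
    rows = []
    for cutoff in cutoffs:
        # Labels of the cases the model classifies as bad at this cut-off.
        flagged = [label for label, s in zip(y_true, scores) if s >= cutoff]
        freq = len(flagged)
        # Share of flagged cases that are truly bad; undefined if none flagged.
        accuracy = sum(flagged) / freq if freq else None
        rows.append((cutoff, freq, accuracy))
    return rows


# Toy data, invented for illustration only.
y_true = [1, 0, 1, 0, 0, 0, 1, 0]
scores = [0.90, 0.60, 0.40, 0.30, 0.20, 0.10, 0.07, 0.02]

chart = threshold_chart(y_true, scores, [0.50, 0.25, 0.05])
for cutoff, freq, accuracy in chart:
    print(f"cut-off {cutoff:.2f}: freq {freq}, accuracy {accuracy:.3f}")

# The figure quoted in the text for model A at a 5% cut-off:
print(f"45 of 396 flagged cases are bad: {45 / 396:.4f}")
```

Lowering the cut-off flags more cases, so the frequency column can only grow, as it does in the table; the accuracy may rise or fall depending on how many of the newly flagged cases are truly bad.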