Data Analysis, Machine Learning and Applications: Episode 1, Part 4

Eugeniusz Gatnar: Fusion of Multiple Statistical Classifiers

$$ KW = \frac{1}{N M^{2}} \sum_{i=1}^{N} L(x_i)\,\bigl(M - L(x_i)\bigr). \tag{11} $$

Also, Dietterich (2000) proposed a measure to assess the level of agreement between classifiers. It is the kappa statistic:

$$ \kappa = 1 - \frac{\frac{1}{M} \sum_{i=1}^{N} L(x_i)\,\bigl(M - L(x_i)\bigr)}{N\,(M - 1)\,\bar{p}\,(1 - \bar{p})}. \tag{12} $$

Hansen and Salamon (1990) introduced the measure of difficulty $\theta$. It is simply the variance of the random variable $Z = L(x)/M$:

$$ \theta = \operatorname{Var}(Z). \tag{13} $$

Two measures of diversity have been proposed by Partridge and Krzanowski (1997) for the evaluation of software diversity. The first one is the generalized diversity measure:

$$ GD = 1 - \frac{p(2)}{p(1)}, \tag{14} $$

where $p(k)$ is the probability that $k$ randomly chosen classifiers will fail on the observation $x$. The second measure is named coincident failure diversity:

$$ CFD = \begin{cases} 0, & \text{where } p_0 = 1, \\ \dfrac{1}{1 - p_0} \displaystyle\sum_{m=1}^{M} \dfrac{M - m}{M - 1}\, p_m, & \text{where } p_0 < 1, \end{cases} \tag{15} $$

where $p_m$ is the probability that exactly $m$ out of $M$ classifiers will fail on an observation $x$.

4 Combination rules

Once we have produced the set of individual classifiers with the desired level of diversity, we combine their predictions to amplify their correct decisions and cancel out the wrong ones. The combination function $F$ in (1) depends on the type of the classifier outputs.

There are three different forms of classifier output. The classifier can produce a single class label (abstract level), rank the class labels according to their posterior probabilities (rank level), or produce a vector of posterior probabilities for the classes (measurement level).

Majority voting is the most popular combination rule for class labels:

$$ C^{*}(x) = \arg\max_{j} \left\{ \sum_{m=1}^{M} I\bigl(C_m(x) = l_j\bigr) \right\}. \tag{16} $$

(In the R statistical environment, class labels are obtained using the command predict(..., type = "class").) It can be proved that majority voting is optimal if the number of classifiers is odd, they have the same accuracy, and the classifiers' outputs are independent.
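The diversity measures and the majority voting rule described above can be illustrated with a small sketch in plain Python. It rests on several assumptions that this excerpt does not state: L(x_i) is taken to be the number of classifiers that label observation x_i correctly, the average accuracy in the kappa statistic is taken to be the mean individual accuracy, the variance in the difficulty measure is the population variance over the sample, and p(1) and p(2) are computed from the failure distribution p_m in the way usually attributed to Partridge and Krzanowski. All function and variable names are hypothetical.

```python
from collections import Counter
from statistics import pvariance


def correct_counts(predictions, truth):
    """L(x_i): how many of the M classifiers label observation i correctly."""
    return [sum(pred[i] == y for pred in predictions) for i, y in enumerate(truth)]


def diversity_measures(predictions, truth):
    """Return KW, kappa, theta, GD and CFD for M >= 2 label vectors of length N."""
    M, N = len(predictions), len(truth)
    L = correct_counts(predictions, truth)
    spread = sum(l * (M - l) for l in L)          # sum_i L(x_i) * (M - L(x_i))

    kw = spread / (N * M ** 2)                    # Kohavi-Wolpert variance (11)

    p_bar = sum(L) / (N * M)                      # assumed: mean individual accuracy
    denom = N * (M - 1) * p_bar * (1 - p_bar)
    # kappa (12) is undefined when every vote is right or every vote is wrong
    kappa = 1 - (spread / M) / denom if denom else float("nan")

    theta = pvariance([l / M for l in L])         # difficulty (13), population variance

    # p[m]: empirical probability that exactly m of the M classifiers fail
    failures = Counter(M - l for l in L)
    p = [failures[m] / N for m in range(M + 1)]

    # p(1), p(2): probability that 1 (resp. 2) randomly chosen classifiers fail
    p1 = sum(m / M * p[m] for m in range(1, M + 1))
    p2 = sum(m * (m - 1) / (M * (M - 1)) * p[m] for m in range(1, M + 1))
    gd = 1 - p2 / p1 if p1 else float("nan")      # generalized diversity (14)

    if p[0] == 1:                                 # coincident failure diversity (15)
        cfd = 0.0
    else:
        cfd = sum((M - m) / (M - 1) * p[m] for m in range(1, M + 1)) / (1 - p[0])

    return {"KW": kw, "kappa": kappa, "theta": theta, "GD": gd, "CFD": cfd}


def majority_vote(predictions):
    """Rule (16): for each observation, the label predicted by most classifiers."""
    # Ties go to the label seen first, which is an arbitrary choice of this sketch.
    return [Counter(votes).most_common(1)[0][0] for votes in zip(*predictions)]


# Toy ensemble: three classifiers, four observations, each classifier errs once.
truth = ["a", "b", "a", "b"]
predictions = [
    ["a", "b", "a", "a"],   # classifier 1
    ["a", "b", "b", "b"],   # classifier 2
    ["a", "a", "a", "b"],   # classifier 3
]
measures = diversity_measures(predictions, truth)
combined = majority_vote(predictions)
print(measures)
print(combined)
```

In this toy ensemble no two classifiers fail on the same observation, so GD and CFD both reach their maximum and the majority vote recovers every true label even though each individual classifier makes one error.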
If we have evidence that certain models are more accurate than others, weighting the individual predictions may improve the overall performance of the ensemble. Behavior Knowledge Space, developed by Huang and Suen (1995), uses
