Chapter 3: Nearest neighbor based classifiers

Chapter 3, Nearest neighbor based classifiers, covers: an introduction; the Nearest Neighbor algorithm; variants of the NN algorithm; data reduction; prototype reduction; Z-score normalization; the Modified k-Nearest Neighbor algorithm; and related topics.

Chapter 3: Nearest neighbor based classifiers
Assoc. Prof. Dr. Duong Tuan Anh
Faculty of Computer Science and Engineering, HCMC Univ. of Technology
3/2015

Outline
Introduction
Nearest Neighbor algorithm
Variants of the NN algorithm
Data Reduction
Prototype reduction

1. Introduction
One of the simplest decision procedures that can be used for classification is the nearest neighbor algorithm. Nearest neighbor based classifiers use some or all of the patterns in the training set to classify a test pattern. They work by finding the similarity between the test pattern and every pattern in the training set. Such classifiers are lazy learners: they do less work when a training pattern is presented and more work when making a classification. The NN classifier is therefore a lazy learner, also known as an instance-based learner.

2. Nearest Neighbor algorithm
The nearest neighbor (NN) algorithm assigns to a test pattern the class label of its closest neighbor. Let there be n training patterns (X1, c1), (X2, c2), ..., (Xn, cn), where Xi is of dimension d and ci is the class label of the ith pattern. If P is the test pattern, then if

    d(P, Xk) = min { d(P, Xi) },  i = 1, 2, ..., n

pattern P is assigned to the class label ck associated with Xk. We have to use some distance measure, e.g., the Euclidean distance, to measure the "closeness" between the test pattern P and the patterns in the training set. The Euclidean distance between two tuples, say X1 = (x11, x12, ..., x1n) and X2 = (x21, x22, ..., x2n), is

    d(X1, X2) = sqrt( (x11 - x21)^2 + (x12 - x22)^2 + ... + (x1n - x2n)^2 )

A minimal implementation sketch of this rule is given after the example below.

Example
The training set consists of 18 labelled two-dimensional patterns X1, X2, ..., X18 belonging to three classes (class labels 1, 2 and 3; in the figure, + marks class 1 and X marks class 2, with the remaining points in class 3).

Figure: Example dataset

Given a test pattern P, we find the distance from P to every pattern in the training set; here we use the Euclidean distance.
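To make the decision rule above concrete, here is a minimal sketch in Python (not from the slides; the function names euclidean_distance and nn_classify are my own) that computes the Euclidean distance from a test pattern to every training pattern and returns the class label of the nearest one.

    import math

    def euclidean_distance(x1, x2):
        # Euclidean distance between two d-dimensional patterns (sequences of numbers).
        return math.sqrt(sum((a - b) ** 2 for a, b in zip(x1, x2)))

    def nn_classify(training_set, p):
        # training_set: list of (pattern, class_label) pairs.
        # Assign to test pattern p the class label of its closest training pattern.
        best_label, best_dist = None, float("inf")
        for x, label in training_set:
            d = euclidean_distance(p, x)
            if d < best_dist:
                best_dist, best_label = d, label
        return best_label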
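As a usage example, the snippet below runs the same rule on a small made-up two-dimensional training set with the three class labels used in the example; both the coordinates and the test pattern here are illustrative placeholders, not the values from the slides.

    # Illustrative data only -- not the example values from the slides.
    training_set = [
        ((1.0, 1.0), 1), ((0.8, 1.2), 1),   # class 1
        ((4.0, 3.0), 2), ((4.2, 2.8), 2),   # class 2
        ((3.5, 0.5), 3), ((3.2, 0.7), 3),   # class 3
    ]
    P = (1.1, 0.9)
    print(nn_classify(training_set, P))     # -> 1, since P is closest to the class-1 patterns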