M. Goldstein, "K-Nearest Neighbor classification," IEEE Trans. Inform. Theory, vol. 18, pp. 627-630, 1972.
Nearest Neighbor
The Nearest Neighbor (NN) classification method assigns to a sample X the class Y_i of the training sample X_i that is closest to X in a suitable metric, usually the Euclidean or Mahalanobis distance.
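The NN rule above can be sketched in a few lines of NumPy (a minimal illustration with the Euclidean metric; the function name and data layout are assumptions, not from the source):

```python
import numpy as np

def nn_classify(X_train, y_train, x):
    """Assign to x the label of the Euclidean-closest training sample."""
    # Distances from x to every training sample (rows of X_train).
    d = np.linalg.norm(X_train - x, axis=1)
    return y_train[np.argmin(d)]
```

The Mahalanobis variant would only change the distance computation, weighting the difference by the inverse covariance of the training data.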
K Nearest Neighbor
The K-NN classification method, considered here for binary classification tasks, assigns to a sample X the class that occurs most often among the k closest training samples; k is assumed to be an odd integer. This is equivalent to assigning the class whose j-th closest training sample is nearer to X, where j = (k+1)/2.
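Both formulations of the rule can be written down and compared directly (a sketch for binary labels 0/1; function names are illustrative assumptions):

```python
import numpy as np

def knn_classify(X_train, y_train, x, k):
    """Majority vote among the k closest training samples (labels 0/1, k odd)."""
    d = np.linalg.norm(X_train - x, axis=1)
    nearest = y_train[np.argsort(d)[:k]]
    return int(nearest.sum() > k // 2)  # class 1 wins with >= (k+1)/2 votes

def knn_by_jth_distance(X0, X1, x, k):
    """Equivalent rule: the class whose j-th closest sample is nearer, j=(k+1)/2."""
    j = (k + 1) // 2
    d0 = np.sort(np.linalg.norm(X0 - x, axis=1))[j - 1]
    d1 = np.sort(np.linalg.norm(X1 - x, axis=1))[j - 1]
    return 0 if d0 < d1 else 1
```

The equivalence holds because if one class has at least j = (k+1)/2 samples among the k nearest, its j-th closest sample lies within the k-th overall distance while the other class's j-th closest lies beyond it (ties in distance aside).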
We denote by X'_i the training samples of the first class and by X"_i those of the second class. Given a test sample X, let H'(z) = Prob( d(X', X) < z ), and define H"(z) analogously. Let also X'_j(X) denote the j-th closest X' to X, and similarly X"_j(X). We want to estimate the probability

    P = Prob( d(X, X'_j(X)) < d(X, X"_j(X)) ),

i.e., the probability that the rule assigns X to the first class.
This event can be decomposed into the disjoint events

    E_k = { exactly k of the n samples X' are closer to X than X"_j(X) }

for k = j, j+1, ..., n, where n is the number of training samples of each class: indeed d(X, X'_j(X)) < d(X, X"_j(X)) holds exactly when at least j of the X' lie within distance d(X, X"_j(X)) of X. The probabilities of these events are

    Prob(E_k) = Integral of C(n,k) H'(z)^k (1 - H'(z))^(n-k) dF"_j(z),

where F"_j(z) = Prob( d(X, X"_j(X)) < z ) and the probability density in the integral is that of the j-th order statistic of the distances to the X",

    dF"_j(z) = n!/((j-1)!(n-j)!) H"(z)^(j-1) (1 - H"(z))^(n-j) dH"(z).
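The decomposition can be checked numerically in a toy model where the distance distributions are chosen by hand (here d(X', X) ~ Uniform(0, 1) and d(X", X) ~ Uniform(0, 2); these distributions, and all function names, are illustrative assumptions, not from the source):

```python
import numpy as np
from math import comb, factorial

def p_via_formula(n, j, H1, H2, h2, z, dz):
    """Sum over k = j..n of Integral C(n,k) H'^k (1-H')^(n-k) dF"_j,
    with dF"_j = n!/((j-1)!(n-j)!) H"^(j-1) (1-H")^(n-j) h" dz (midpoint rule)."""
    Hp, Hq, hq = H1(z), H2(z), h2(z)
    c = factorial(n) / (factorial(j - 1) * factorial(n - j))
    dens = c * Hq ** (j - 1) * (1 - Hq) ** (n - j) * hq  # density of d(X"_j)
    return sum(
        comb(n, k) * np.sum(Hp ** k * (1 - Hp) ** (n - k) * dens) * dz
        for k in range(j, n + 1)
    )

def p_via_simulation(n, j, rng, trials=200_000):
    """Direct Monte Carlo estimate of Prob( d(X'_j) < d(X"_j) ) in the toy model."""
    d1 = np.sort(rng.uniform(0.0, 1.0, (trials, n)), axis=1)[:, j - 1]
    d2 = np.sort(rng.uniform(0.0, 2.0, (trials, n)), axis=1)[:, j - 1]
    return float(np.mean(d1 < d2))
```

With, say, n = 5 and j = 3 the two estimates agree to well within Monte Carlo error, supporting the order-statistic decomposition above.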
From this formula it is possible to estimate an upper bound on the probability of misclassification for the K-NN. ...
Marco Corvi