Distance based k-means clustering algorithm for determining number of clusters for high dimensional data