Sublinear-time approximation algorithms for clustering via random sampling
Sohler, Christian
Czumaj, Artur
We present a novel analysis of a random sampling approach for four clustering problems in metric spaces: k-median, k-means, min-sum k-clustering, and balanced k-median. For all of these problems, we consider the following simple sampling scheme: select a small sample set of input points uniformly at random, then run an approximation algorithm on this sample set to compute an approximation of the best possible clustering of this set. Our main technical contribution is a significantly strengthened analysis of the approximation guarantee of this scheme for these clustering problems. The main motivation behind our analyses was to design sublinear-time algorithms for clustering problems. Our second contribution is the development of new approximation algorithms for the aforementioned clustering problems. Using our random sampling approach, we obtain, for the first time, approximation algorithms for these problems whose running time is independent of the input size and depends only on k and the diameter of the metric space. © 2006 Wiley Periodicals, Inc. Random Struct. Alg., 2007. A preliminary extended abstract of this work appeared in Proceedings of the 31st Annual International Colloquium on Automata, Languages and Programming (ICALP), pp. 396-407, 2004.
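The sampling scheme described in the abstract can be illustrated with a minimal Python sketch for the k-median case. This is not the authors' algorithm or analysis: the function names (`kmedian_cost`, `solve_on_sample`, `sampled_kmedian`), the exhaustive solver used on the sample, and the caller-chosen `sample_size` are all assumptions made for illustration. In particular, the sample size required for the paper's guarantee is not reproduced here.

```python
import itertools
import random


def kmedian_cost(points, centers, dist):
    """Sum, over all points, of the distance to the nearest center."""
    return sum(min(dist(p, c) for c in centers) for p in points)


def solve_on_sample(sample, k, dist):
    """Hypothetical stand-in solver: exhaustive discrete k-median on the sample.

    Any approximation algorithm could be substituted here; brute force is
    used only because the sample is assumed to be small.
    """
    best = min(
        itertools.combinations(sample, k),
        key=lambda centers: kmedian_cost(sample, centers, dist),
    )
    return list(best)


def sampled_kmedian(points, k, sample_size, dist, seed=None):
    """Draw a uniform random sample, then cluster only the sample."""
    rng = random.Random(seed)
    # Uniform sampling without replacement; sample_size is an assumed
    # parameter, not the bound derived in the paper.
    sample = rng.sample(points, min(sample_size, len(points)))
    return solve_on_sample(sample, k, dist)
```

For example, `sampled_kmedian(pts, 2, 4, lambda a, b: abs(a - b), seed=1)` returns two centers drawn from a four-point sample of `pts`. The work done after sampling depends only on `sample_size` and `k`, not on the number of input points.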
2007
info:eu-repo/semantics/article
doc-type:article
text
https://ris.uni-paderborn.de/record/18665
Sohler C, Czumaj A. Sublinear-time approximation algorithms for clustering via random sampling. Random Structures & Algorithms. 2007;30(1-2):226-256.
eng
info:eu-repo/semantics/closedAccess