A Theoretical Analysis of the Fuzzy K-Means Problem
Blömer, Johannes
Brauer, Sascha
Bujna, Kathrin
unsolvability by radicals
clustering
fuzzy k-means
probabilistic method
approximation algorithms
randomized algorithms
One of the most popular fuzzy clustering techniques is the fuzzy K-means algorithm (also known as fuzzy-c-means or FCM algorithm). In contrast to the K-means and K-median problem, the underlying fuzzy K-means problem has not been studied from a theoretical point of view. In particular, there are no algorithms with approximation guarantees similar to the famous K-means++ algorithm known for the fuzzy K-means problem. This work initiates the study of the fuzzy K-means problem from an algorithmic and complexity theoretic perspective. We show that optimal solutions for the fuzzy K-means problem cannot, in general, be expressed by radicals over the input points. Surprisingly, this already holds for simple inputs in one-dimensional space. Hence, one cannot expect to compute optimal solutions exactly. We give the first (1+eps)-approximation algorithms for the fuzzy K-means problem. First, we present a deterministic approximation algorithm whose runtime is polynomial in N and linear in the dimension D of the input set, given that K is constant, i.e. a polynomial time approximation scheme (PTAS) for fixed K. We achieve this result by showing that for each soft clustering there exists a hard clustering with similar properties. Second, by using techniques known from coreset constructions for the K-means problem, we develop a deterministic approximation algorithm that runs in time almost linear in N but exponential in the dimension D. We complement these results with a randomized algorithm which imposes some natural restrictions on the sought solution and whose runtime is comparable to some of the most efficient approximation algorithms for K-means, i.e. linear in the number of points and the dimension, but exponential in the number of clusters.
IEEE
2016
info:eu-repo/semantics/conferenceObject
doc-type:conferenceObject
text
https://ris.uni-paderborn.de/record/2367
Blömer J, Brauer S, Bujna K. A Theoretical Analysis of the Fuzzy K-Means Problem. In: <i>2016 IEEE 16th International Conference on Data Mining (ICDM)</i>. IEEE; 2016:805-810. doi:<a href="https://doi.org/10.1109/icdm.2016.0094">10.1109/icdm.2016.0094</a>
eng
info:eu-repo/semantics/altIdentifier/doi/10.1109/icdm.2016.0094
info:eu-repo/semantics/altIdentifier/isbn/9781509054732
info:eu-repo/semantics/closedAccess