k-means assumes the variance of the distribution of each attribute (variable) is spherical;
all variables have the same variance;
the prior probability for all k clusters is the same, i.e., each cluster has a roughly equal number of observations;
If any one of these 3 assumptions is violated, then k-means will fail.
I do not understand the logic behind this statement. As far as I can tell, the k-means method makes essentially no assumptions; it just minimizes the sum of squared errors (SSE). So I cannot see the link between minimizing the SSE and those 3 "assumptions".
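To make concrete what I mean by "it just minimizes the SSE", here is a minimal sketch of the standard Lloyd iteration as I understand it, in plain Python. The function names (`sq_dist`, `sse`, `kmeans`) and the toy two-blob data are my own and purely illustrative; this is not taken from any particular library.

```python
import random

def sq_dist(p, q):
    """Squared Euclidean distance between two points."""
    return sum((a - b) ** 2 for a, b in zip(p, q))

def sse(points, centers, labels):
    """Sum of squared distances from each point to its assigned center."""
    return sum(sq_dist(p, centers[l]) for p, l in zip(points, labels))

def kmeans(points, k, iters=20, seed=0):
    """Lloyd's algorithm; returns centers, labels, and the SSE after each iteration."""
    rng = random.Random(seed)
    centers = rng.sample(points, k)  # initialize with k distinct data points
    history = []
    for _ in range(iters):
        # Assignment step: each point goes to its nearest center.
        labels = [min(range(k), key=lambda j: sq_dist(p, centers[j]))
                  for p in points]
        # Update step: each center moves to the mean of its assigned points.
        for j in range(k):
            members = [p for p, l in zip(points, labels) if l == j]
            if members:  # keep the old center if a cluster ends up empty
                centers[j] = tuple(sum(c) / len(members) for c in zip(*members))
        history.append(sse(points, centers, labels))
    return centers, labels, history

# Illustrative toy data (made up): two Gaussian blobs in 2D.
rng = random.Random(1)
points = [(rng.gauss(0, 1), rng.gauss(0, 1)) for _ in range(50)] + \
         [(rng.gauss(5, 1), rng.gauss(5, 1)) for _ in range(50)]

centers, labels, history = kmeans(points, k=2)
print("SSE after first iteration:", history[0])
print("SSE after last iteration: ", history[-1])
```

Both steps can only lower the SSE or leave it unchanged, and nothing in this code refers explicitly to variances, cluster shapes, or cluster sizes, which is why I cannot see where the 3 "assumptions" come in.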