2.5.1
k-means
Summary
- k-means follows an intuitive rule—group nearby points together—by repeatedly updating cluster representatives (centroids) until assignments stabilise.
- The objective function is the within-cluster sum of squares (WCSS), i.e. the sum of squared distances between each sample and its assigned centroid.
- With scikit-learn’s KMeans you can visualise convergence, experiment with initialisation schemes, and inspect how assignments change.
- Choosing \(k\) typically involves diagnostics such as the elbow method or silhouette scores, balanced with domain knowledge.
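As a rough illustration of the last two points, the sketch below fits scikit-learn's KMeans for several values of \(k\) and records the WCSS (exposed by scikit-learn as the `inertia_` attribute) alongside the mean silhouette score. The synthetic blob data, the range of \(k\), and the variable names are assumptions made for this example, not part of the original text.

```python
from sklearn.cluster import KMeans
from sklearn.datasets import make_blobs
from sklearn.metrics import silhouette_score

# Synthetic data: three well-separated blobs (centres chosen for illustration).
X, _ = make_blobs(
    n_samples=300,
    centers=[(-5, -5), (0, 5), (5, -5)],
    cluster_std=0.8,
    random_state=0,
)

inertias = {}     # WCSS per k, the quantity plotted in the elbow method
silhouettes = {}  # mean silhouette score per k
for k in range(2, 7):
    model = KMeans(n_clusters=k, init="k-means++", n_init=10, random_state=0)
    labels = model.fit_predict(X)
    inertias[k] = model.inertia_
    silhouettes[k] = silhouette_score(X, labels)

# The k with the highest mean silhouette is one candidate choice.
best_k = max(silhouettes, key=silhouettes.get)
for k in inertias:
    print(f"k={k}  WCSS={inertias[k]:.1f}  silhouette={silhouettes[k]:.3f}")
print("k with highest silhouette:", best_k)
```

WCSS generally keeps falling as \(k\) grows, so the elbow method looks for the point where the decrease flattens rather than for a minimum. The silhouette score, by contrast, can be compared directly across values of \(k\). Either diagnostic is a guide to be weighed against domain knowledge, not a rule.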
Intuition
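k-means is best interpreted through its assumptions (roughly spherical, similarly sized clusters measured by Euclidean distance), the condition of the data (feature scales, outliers), and how parameter choices such as \(k\) and the initialisation scheme affect generalisation.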
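The rule from the summary (assign each point to its nearest centroid, move each centroid to the mean of its points, and repeat until assignments stop changing) can be sketched in a few lines of NumPy. This is a minimal illustration under stated assumptions, not scikit-learn's implementation: the function name `lloyd_kmeans`, the random-sample initialisation, and the toy data are all hypothetical choices for this example.

```python
import numpy as np


def lloyd_kmeans(X, k, max_iter=100, seed=0):
    """Illustrative k-means loop (hypothetical helper, not a library API)."""
    X = np.asarray(X, dtype=float)
    rng = np.random.default_rng(seed)
    # Assumed initialisation: k distinct samples picked at random.
    centroids = X[rng.choice(len(X), size=k, replace=False)]
    labels = np.full(len(X), -1)
    for _ in range(max_iter):
        # Assignment step: squared Euclidean distance to every centroid.
        dists = ((X[:, None, :] - centroids[None, :, :]) ** 2).sum(axis=2)
        new_labels = dists.argmin(axis=1)
        if np.array_equal(new_labels, labels):
            break  # assignments have stabilised
        labels = new_labels
        # Update step: move each centroid to the mean of its members.
        for j in range(k):
            members = X[labels == j]
            if len(members) > 0:  # keep the old centroid if a cluster is empty
                centroids[j] = members.mean(axis=0)
    # WCSS: sum of squared distances from each sample to its assigned centroid.
    wcss = ((X - centroids[labels]) ** 2).sum()
    return labels, centroids, wcss


# Toy data: three Gaussian blobs (locations chosen for illustration).
rng = np.random.default_rng(42)
X = np.vstack([
    rng.normal(loc=(-5, -5), scale=0.5, size=(50, 2)),
    rng.normal(loc=(0, 5), scale=0.5, size=(50, 2)),
    rng.normal(loc=(5, -5), scale=0.5, size=(50, 2)),
])
labels, centroids, wcss = lloyd_kmeans(X, k=3)
print("WCSS:", round(float(wcss), 2))
```

Because the loop only ever reaches a local optimum of the WCSS, the result depends on the initial centroids. That is why practical implementations typically use a more careful initialisation or several restarts.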