Communication Patterns in Cloud Haskell (Part 4)
Posted on October 15, 2012
K-Means
In Part 3 of this series we showed how to write a simple distributed implementation of Map-Reduce using Cloud Haskell. In this final part of the series we will explain the K-Means algorithm and show how it can be implemented in terms of Map-Reduce.
K-Means is an algorithm for partitioning a set of points into n clusters. It repeats the following two steps a fixed number of times (or until the cluster centres stop changing):
- Given a set of points and n cluster centres, associate each point with the cluster centre it is nearest to.
- Compute the centre of each new cluster.
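The two steps above can be sketched sequentially in a few lines of Haskell. This is only an illustrative, single-machine sketch, not the distributed implementation: the names `Point`, `distanceSq`, `assign`, `centroid`, `step` and `kmeans` are assumptions made for this example, points are assumed to be two-dimensional, and the list of centres is assumed to be non-empty.

```haskell
import Data.List (minimumBy)
import Data.Ord (comparing)
import qualified Data.Map.Strict as Map

-- Assumption for this sketch: points live in the 2D plane.
type Point = (Double, Double)

-- Squared Euclidean distance; enough to decide which centre is nearest.
distanceSq :: Point -> Point -> Double
distanceSq (x1, y1) (x2, y2) = (x1 - x2) ^ (2 :: Int) + (y1 - y2) ^ (2 :: Int)

-- Step 1: associate each point with the index of its nearest centre.
-- Assumes the list of centres is non-empty.
assign :: [Point] -> [Point] -> Map.Map Int [Point]
assign centres points =
  Map.fromListWith (++) [ (nearest p, [p]) | p <- points ]
  where
    nearest p =
      fst (minimumBy (comparing (distanceSq p . snd)) (zip [0 ..] centres))

-- Step 2: the centre of a (non-empty) cluster is the mean of its points.
centroid :: [Point] -> Point
centroid ps = (sum (map fst ps) / n, sum (map snd ps) / n)
  where
    n = fromIntegral (length ps)

-- One iteration: assign points to centres, then recompute the centres.
-- A centre that attracts no points is dropped in this sketch.
step :: [Point] -> [Point] -> [Point]
step points centres = map centroid (Map.elems (assign centres points))

-- Run a fixed number of iterations from the given initial centres.
kmeans :: Int -> [Point] -> [Point] -> [Point]
kmeans iterations points centres = iterate (step points) centres !! iterations
```

For example, with the points `[(0,0), (0,2), (10,0), (10,2)]` and initial centres `[(1,1), (9,1)]`, a single `step` moves the centres to `[(0.0,1.0),(10.0,1.0)]`, and further iterations leave them unchanged. Keying the clusters by centre index already has the shape of a Map-Reduce job: the assignment produces key/value pairs and the centroid computation reduces the values for each key.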