Chapter 2. Learning

Table of Contents

2.1. k-Means clustering algorithm
2.2. Baum-Welch algorithm

2.1. k-Means clustering algorithm

This algorithm first groups the observations into clusters using the general-purpose clustering class KMeansCalculator. This class takes elements as input and outputs clusters made of those elements. The elements must implement the CentroidFactory interface, meaning that they must be convertible to a Centroid.

A centroid is, roughly speaking, a mean value of a set of CentroidFactory elements. For example, if the set of elements is the integer values {2, 3}, then their centroid could be their arithmetic mean, 2.5.
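The arithmetic-mean centroid of the example above can be sketched as follows. This is only an illustration of the idea: the class below is a simplified stand-in, not the library's actual Centroid or CentroidFactory interface.

```java
import java.util.List;

// Illustrative sketch only: a minimal centroid over double values whose
// "value" is the arithmetic mean of the elements it covers.  The real
// library's Centroid/CentroidFactory interfaces are richer than this.
final class MeanCentroid {
    private double sum;
    private int count;

    // Fold one more element into the centroid.
    void add(double element) {
        sum += element;
        count++;
    }

    // The centroid's value: the arithmetic mean of all added elements.
    double value() {
        return sum / count;
    }

    // Build a centroid from a whole set of elements at once.
    static MeanCentroid of(List<Double> elements) {
        MeanCentroid c = new MeanCentroid();
        for (double e : elements)
            c.add(e);
        return c;
    }
}
```

With this sketch, `MeanCentroid.of(List.of(2.0, 3.0)).value()` yields 2.5, matching the example in the text.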

To use the k-Means learning algorithm, one first instantiates a KMeansLearner object. Remember that the set of observation sequences given to the constructor must also be a set of CentroidFactory objects.

Once this is done, each call to the iterate() function returns a better approximation of a matching HMM. Notice that the first call to this function returns an HMM that does not take the temporal ordering of the observations into account (it can be a good starting point for the Baum-Welch algorithm).

The learn() method calls the iterate() method until a fixed point is reached (i.e. the clusters no longer change).
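The fixed-point loop behind learn() can be sketched with a toy one-dimensional k-means: repeat one iteration (assign each point to its nearest centroid, then recompute each centroid as the mean of its cluster) until the assignment stops changing. This is an illustrative sketch of the stopping criterion, not the library's implementation.

```java
// Illustrative sketch: k-means iterated to a fixed point.  The loop stops
// exactly when one full iteration leaves the cluster assignment unchanged,
// which is the stopping condition described in the text.
final class KMeansFixedPoint {
    static int[] cluster(double[] points, double[] centroids) {
        int[] assignment = new int[points.length];
        boolean changed = true;
        while (changed) {                  // stop at the fixed point
            changed = false;
            // Assignment step: attach each point to its nearest centroid.
            for (int i = 0; i < points.length; i++) {
                int best = 0;
                for (int c = 1; c < centroids.length; c++)
                    if (Math.abs(points[i] - centroids[c])
                            < Math.abs(points[i] - centroids[best]))
                        best = c;
                if (assignment[i] != best) {
                    assignment[i] = best;
                    changed = true;
                }
            }
            // Update step: move each centroid to the mean of its cluster
            // (the "centroid" of the cluster, in the sense defined above).
            for (int c = 0; c < centroids.length; c++) {
                double sum = 0;
                int n = 0;
                for (int i = 0; i < points.length; i++)
                    if (assignment[i] == c) {
                        sum += points[i];
                        n++;
                    }
                if (n > 0)
                    centroids[c] = sum / n;
            }
        }
        return assignment;
    }
}
```

For instance, clustering the points {1, 2, 10, 11} starting from centroids {0, 20} converges after a few iterations to the assignment [0, 0, 1, 1].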