Time-series segmentation

Time-series segmentation divides an input time-series into a sequence of discrete segments in order to reveal the underlying properties of its source. A typical application is speaker diarization, in which an audio signal is partitioned into several pieces according to who is speaking at what times.

Algorithms based on change-point detection include sliding windows, bottom-up, and top-down methods.[1]
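As an illustration of the bottom-up family, the following minimal sketch (an assumption-laden example, not a method taken from the cited sources: segments are scored by the squared error of a piecewise-constant fit, and adjacent segments are merged until a requested count remains) shows the basic merge loop:

```python
import numpy as np

def segment_cost(x):
    """Squared error of approximating a segment by its mean."""
    return float(np.sum((x - x.mean()) ** 2))

def bottom_up_segment(x, n_segments, init_size=2):
    """Bottom-up segmentation (hypothetical helper, for illustration only).

    Start from many small segments and repeatedly merge the adjacent pair
    whose merge increases the approximation error the least, until only
    `n_segments` segments remain. Returns a list of (start, end) indices.
    """
    x = np.asarray(x, dtype=float)
    # Initial fine-grained segmentation.
    bounds = list(range(0, len(x), init_size)) + [len(x)]
    segments = [(bounds[i], bounds[i + 1]) for i in range(len(bounds) - 1)]
    while len(segments) > n_segments:
        # Extra error incurred by merging each adjacent pair of segments.
        costs = [
            segment_cost(x[segments[i][0]:segments[i + 1][1]])
            - segment_cost(x[segments[i][0]:segments[i][1]])
            - segment_cost(x[segments[i + 1][0]:segments[i + 1][1]])
            for i in range(len(segments) - 1)
        ]
        i = int(np.argmin(costs))
        segments[i:i + 2] = [(segments[i][0], segments[i + 1][1])]
    return segments

# Example: a piecewise-constant signal with two change points.
signal = np.concatenate([np.zeros(50), np.ones(50) * 3, np.ones(50) * -2])
print(bottom_up_segment(signal, n_segments=3))  # [(0, 50), (50, 100), (100, 150)]
```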

Probabilistic methods based on hidden Markov models have also proved useful in solving this problem.[2]

It is often the case that a time-series can be represented as a sequence of discrete segments of finite length.

For example, the trajectory of a stock market could be partitioned into regions that lie in between important world events, the input to a handwriting recognition application could be segmented into the various words or letters that it was believed to consist of, or the audio recording of a conference could be divided according to who was speaking when.

In the latter two cases, one may take advantage of the fact that the label assignments of individual segments may repeat themselves (for example, if a person speaks at several separate occasions during a conference) by attempting to cluster the segments according to their distinguishing properties (such as the spectral content of each speaker's voice).
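To make the clustering idea concrete, the sketch below groups segments by a crude spectral fingerprint; the feature choice and the use of scikit-learn's KMeans are assumptions made for illustration, not the article's method:

```python
import numpy as np
from sklearn.cluster import KMeans

def spectral_feature(segment, n_bins=64):
    """Crude spectral fingerprint: the magnitude spectrum averaged down to
    a fixed number of bins (an assumed, illustrative feature)."""
    spectrum = np.abs(np.fft.rfft(segment))
    return np.array([chunk.mean() for chunk in np.array_split(spectrum, n_bins)])

def cluster_segments(segments, n_speakers):
    """Group segments with similar spectral content, so that repeated labels
    (e.g. the same speaker talking twice) fall into the same cluster."""
    features = np.vstack([spectral_feature(s) for s in segments])
    return KMeans(n_clusters=n_speakers, n_init=10, random_state=0).fit_predict(features)

# Toy example: two "speakers" simulated as noisy sinusoids of different pitch.
rng = np.random.default_rng(0)
t = np.linspace(0, 1, 8000)

def speaker(freq):
    return np.sin(2 * np.pi * freq * t) + 0.1 * rng.standard_normal(t.size)

segments = [speaker(220), speaker(440), speaker(220), speaker(440)]
print(cluster_segments(segments, n_speakers=2))  # e.g. [0 1 0 1]; label order is arbitrary
```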

There are two general approaches to the segmentation problem.

The first involves looking for change points in the time-series: for example, one may assign a segment boundary whenever there is a large jump in the average value of the signal.
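As a minimal sketch of this change-point idea (illustrative only; the window length, threshold, and windowed-mean test are assumptions, not a method specified in the text), one can compare the means of the windows just before and just after each time step and mark a boundary where they differ strongly:

```python
import numpy as np

def mean_jump_change_points(x, window=20, threshold=1.0):
    """Mark a boundary wherever the mean of the `window` samples after t
    differs from the mean of the `window` samples before t by more than
    `threshold` (illustrative sliding-window detector, hypothetical helper)."""
    x = np.asarray(x, dtype=float)
    jumps = np.array([
        abs(x[t:t + window].mean() - x[t - window:t].mean())
        for t in range(window, len(x) - window)
    ])
    boundaries = []
    for i in np.argsort(jumps)[::-1]:          # consider the strongest jumps first
        if jumps[i] < threshold:
            break
        t = i + window
        # Keep a candidate only if it is not too close to an accepted boundary.
        if all(abs(t - b) > window for b in boundaries):
            boundaries.append(t)
    return sorted(boundaries)

# Example: a noisy signal whose average level shifts twice.
rng = np.random.default_rng(1)
signal = np.concatenate([
    rng.normal(0.0, 0.3, 100),
    rng.normal(2.0, 0.3, 100),
    rng.normal(-1.0, 0.3, 100),
])
print(mean_jump_change_points(signal))         # approximately [100, 200]
```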

The second approach assumes that each segment is generated by a system with distinct parameters and infers the segment boundaries from the most likely parameter values; hidden Markov models are a common choice of such a system.

In the hidden Markov model approach, the time-series x_1, ..., x_T is assumed to have been generated as the system transitions among a set of discrete, hidden states z_1, ..., z_T.

Each observation x_t is drawn from an observation (or emission) distribution indexed by the current hidden state z_t.
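Written out explicitly (a sketch using notation assumed here rather than taken from the text above: pi_k denotes the transition distribution of state k and theta_k its emission parameters), the generative model is:

```latex
z_1 \sim \pi_0, \qquad
z_t \mid z_{t-1} \sim \pi_{z_{t-1}}, \qquad
x_t \mid z_t \sim F(\theta_{z_t}), \qquad t = 1, \dots, T.
```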

The hidden state sequence and the emission-distribution parameters can be learned using the Baum-Welch algorithm, a variant of expectation maximization applied to HMMs.
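For instance, the third-party hmmlearn package (the choice of this library and of Gaussian emissions is an assumption for illustration, not something prescribed by the text) fits the parameters with EM (Baum-Welch) and then segments the series by decoding the most likely state sequence:

```python
import numpy as np
from hmmlearn.hmm import GaussianHMM  # third-party package: pip install hmmlearn

# Toy series: three regimes with different means, stacked end to end.
rng = np.random.default_rng(2)
series = np.concatenate([
    rng.normal(0.0, 0.5, 100),
    rng.normal(3.0, 0.5, 100),
    rng.normal(-2.0, 0.5, 100),
]).reshape(-1, 1)                      # hmmlearn expects shape (n_samples, n_features)

# Fit transition and emission parameters with EM (Baum-Welch).
model = GaussianHMM(n_components=3, covariance_type="diag", n_iter=100, random_state=0)
model.fit(series)

# Decode the most likely hidden-state sequence and read off segment boundaries.
states = model.predict(series)
boundaries = np.flatnonzero(np.diff(states)) + 1
print(boundaries)                      # approximately [100, 200]
```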

More robust parameter-learning methods involve placing hierarchical Dirichlet process priors over the HMM transition matrix.