Step detection

The step detection problem occurs in multiple scientific and engineering contexts, for example in statistical process control[1] (the control chart being the most directly related method), in exploration geophysics (where the problem is to segment a well-log recording into stratigraphic zones[2]), in genetics (the problem of separating microarray data into similar copy-number regimes[3]), and in biophysics (detecting state transitions in a molecular machine as recorded in time-position traces[4]).

For 2D signals, the related problem of edge detection has been studied intensively for image processing.

[5] When the step detection must be performed as and when the data arrives, then online algorithms are usually used,[6] and it becomes a special case of sequential analysis.

Such algorithms include the classical CUSUM method applied to changes in mean.

[7] By contrast, offline algorithms are applied to the data potentially long after it has been received.

Most offline algorithms for step detection in digital data can be categorised as top-down, bottom-up, sliding window, or global methods.

These algorithms start with the assumption that there are no steps and introduce possible candidate steps one at a time, testing each candidate to find the one that minimizes some criteria (such as the least-squares fit of the estimated, underlying piecewise constant signal).

An example is the stepwise jump placement algorithm, first studied in geophysical problems,[2] that has found recent uses in modern biophysics.

[8] Bottom-up algorithms take the "opposite" approach to top-down methods, first assuming that there is a step in between every sample in the digital signal, and then successively merging steps based on some criteria tested for every candidate merge.

The evidence for a step is tested by statistical procedures, for example, by use of the two-sample Student's t-test.

Filters such as these attempt to remove the noise whilst preserving the abrupt steps.

Because steps and (independent) noise have theoretically infinite bandwidth and so overlap in the Fourier basis, signal processing approaches to step detection generally do not use classical smoothing techniques such as the low pass filter.

For this reason, step detection can be profitably viewed as the problem of recovering a piecewise constant signal corrupted by noise.

Many algorithms for step detection are therefore best understood as either 0-degree spline fitting, or level set recovery, methods.

These techniques are best understood as methods for finding a level set description of the underlying piecewise constant signal.

Many algorithms explicitly fit 0-degree splines to the noisy signal in order to detect steps (including stepwise jump placement methods[2][8]), but there are other popular algorithms that can also be seen to be spline fitting methods after some transformation, for example total variation denoising.

The goal is to minimize H[m] with respect to the output signal m. The form of the function

For example, choosing: where I(S) = 0 if the condition S is false, and one otherwise, obtains the total variation denoising algorithm with regularization parameter

Similarly: leads to the mean shift algorithm, when using an adaptive step size Euler integrator initialized with the input signal x.

Yet another special case is: specifying a group of algorithms that attempt to greedily fit 0-degree splines to the signal.

[13] A classical variational method for step detection is the Potts model.

there are fast algorithms which give an exact solution of the Potts problem in

Examples of signals that may contain steps corrupted by noise. (a) DNA copy-number ratios from microarray data, (b) cosmic ray intensity from a neutron monitor , (c) rotation speed against time of R. Sphaeroides flagellar motor , and (d) red pixel intensity from a single scan line of a digital image.