Kanade–Lucas–Tomasi feature tracker

It is proposed mainly for the purpose of dealing with the problem that traditional image registration techniques are generally costly.

KLT makes use of spatial intensity information to direct the search for the position that yields the best match.

It is faster than traditional techniques for examining far fewer potential matches between the images.

The traditional image registration problem can be characterized as follows: Given two functions

: The KLT feature tracker is based on two papers: In the first paper, Lucas and Kanade[1] developed the idea of a local search using gradients weighted by an approximation to the second derivative of the image.

The average can be further improved by weighting the contribution of each term to it, which is inversely proportional to an estimate of

For the purpose of facilitating the expression, a weighting function is defined:

The procedure is applied repeatedly, yielding a type of Newton–Raphson iteration.

The derivation above cannot be generalized well to two dimensions for the 2-D linear approximation occurs differently.

This can be corrected by applying the linear approximation in the form:

This is basically the same as the 1-D case, except for the fact that the weighting function

To evaluate the performance of the algorithm, we are naturally curious about under what conditions and how fast the sequence of

, i.e. for initial misregistrations as large as one-half wavelength.

The range of convergence can be improved by suppressing high spatial frequencies in the image, which could be achieved by smoothing the image, that will also undesirably suppress small details of it.

Since lowpass-filtered images can be sampled at lower resolution with no loss of information, a coarse-to-fine strategy is adopted.

A low-resolution smoothed version of the image can be used to obtain an approximate match.

falls off to zero as the displacement approaches one-half wavelength.

The implementation requires the calculation of the weighted sums of the quantities

The method can also be extended to take into account registration based on more complex transformations, such as rotation, scaling, and shearing, by considering

The approximation can be used similarly to find the error expression, which becomes quadratic in the quantities to be minimized with respect to.

After figuring out the error expression, differentiate it with respect to the quantities to be minimized and set the results zero, yielding a set of linear equations, then solve them.

Combining this expression with the general linear transformation registration problem:

In the second paper Tomasi and Kanade[2] used the same basic method for finding the registration due to the translation but improved the technique by tracking features that are suitable for the tracking algorithm.

The proposed features would be selected if both the eigenvalues of the gradient matrix were larger than some threshold.

A local patch is considered a good feature to track if both of the two eigenvalues (

A tracking method based on these two papers is generally considered a KLT tracker.

In a third paper, Shi and Tomasi[3] proposed an additional stage of verifying that features were tracked correctly.

If the affine compensated image is too dissimilar the feature is dropped.

The reasoning is that between consecutive frames a translation is a sufficient model for tracking but due to more complex motion, perspective effects, etc.

Using a similar derivation as for the KLT, Shi and Tomasi showed that the search can be performed using the formula