It describes the distribution of the gradient in a specified neighborhood around a point and makes this information invariant with respect to the observing coordinates.
where w is some fixed "window function" (such as a Gaussian blur), a distribution on two variables.
is an indicator of the degree of anisotropy of the gradient in the window, namely how strongly it is biased towards a particular direction (and its opposite).
The formula is undefined, even in the limit, when the image is constant in the window (both eigenvalues are then zero).
Aligned but oppositely oriented gradient vectors would cancel out in this average, whereas in the structure tensor they are properly added together.
By making the window function wider (that is, increasing its variance), one can make the structure tensor more robust in the face of noise, at the cost of diminished spatial resolution.
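As an illustration, the two-dimensional structure tensor with a Gaussian window, together with the anisotropy (coherence) measure, can be sketched in NumPy as follows. This is a minimal sketch, not a reference implementation; the function names, the default sigma, and the eps guard for constant regions are choices made for this example.

```python
import numpy as np

def gaussian_kernel(sigma):
    """1-D Gaussian kernel, truncated at about 3 sigma and normalized."""
    radius = max(1, int(3 * sigma))
    x = np.arange(-radius, radius + 1, dtype=float)
    k = np.exp(-x**2 / (2 * sigma**2))
    return k / k.sum()

def smooth(img, sigma):
    """Separable Gaussian smoothing, playing the role of the window w."""
    k = gaussian_kernel(sigma)
    out = np.apply_along_axis(np.convolve, 1, img, k, mode='same')
    return np.apply_along_axis(np.convolve, 0, out, k, mode='same')

def structure_tensor_2d(img, sigma=1.5):
    """Window-averaged outer products of the image gradient."""
    Iy, Ix = np.gradient(img.astype(float))   # gradients along rows, columns
    Jxx = smooth(Ix * Ix, sigma)
    Jxy = smooth(Ix * Iy, sigma)
    Jyy = smooth(Iy * Iy, sigma)
    return Jxx, Jxy, Jyy

def coherence(Jxx, Jxy, Jyy, eps=1e-12):
    """Anisotropy measure ((l1 - l2) / (l1 + l2))**2 from the tensor
    components; the eps guard handles the otherwise undefined 0/0
    case in constant regions by returning 0 there."""
    tr = Jxx + Jyy                                  # l1 + l2
    disc = np.sqrt(((Jxx - Jyy) / 2)**2 + Jxy**2)   # (l1 - l2) / 2
    return np.where(tr > eps, (2 * disc / (tr + eps))**2, 0.0)
```

On a linear ramp image the gradient points in one direction everywhere, so the coherence is close to 1; in a constant region it is 0 by the convention chosen here.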
Using Parseval's identity it is clear that the three real numbers are the second-order moments of the power spectrum of the image.
It also follows that if the gradient is represented as a complex number and is remapped by squaring (i.e. the argument angle of the complex gradient is doubled), then averaging acts as an optimizer in the mapped domain, since it directly delivers both the optimal direction (in the double-angle representation) and the associated certainty.
The first term represents the linear symmetry component of the structure tensor, containing all directional information (as a rank-1 matrix), whereas the second term represents the balanced body component of the tensor, which lacks any directional information (being proportional to the identity matrix).
is the (inner scale) parameter determining the effective frequency range in which the orientation is to be estimated.
The elegance of the complex representation stems from the fact that the two components of the structure tensor can be obtained as averages, and independently of each other.
can be used in a scale space representation to describe the evidence for the presence of a unique orientation and the evidence for the alternative hypothesis, the presence of multiple balanced orientations, without computing the eigenvectors and eigenvalues.
A functional such as squaring the complex numbers has, to date, not been shown to exist for structure tensors with dimensions higher than two.
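The double-angle averaging described above can be sketched as follows; this is a minimal illustration, and the function name and use of unit weights are choices made for this example.

```python
import numpy as np

def dominant_orientation(Ix, Iy, weights=None):
    """Average the squared complex gradient over a window; the argument
    of the mean (halved) gives the dominant orientation, and its
    magnitude gives the associated certainty."""
    g = np.ravel(Ix) + 1j * np.ravel(Iy)   # gradient as a complex number
    g2 = g**2                              # squaring doubles the argument angle
    w = None if weights is None else np.ravel(weights)
    avg = np.average(g2, weights=w)
    return 0.5 * np.angle(avg), np.abs(avg)
```

Note that gradients at angles theta and theta + pi map to the same point after squaring, so oppositely oriented gradients reinforce rather than cancel, exactly the property motivating the double-angle representation.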
[8] The complex representation of the structure tensor is frequently used in fingerprint analysis to obtain direction maps with associated certainties; these in turn are used to enhance the fingerprints, to find the locations of the global (cores and deltas) and local (minutiae) singularities, and to automatically evaluate the quality of the fingerprints.
The eigenvalues and corresponding eigenvectors of the structure tensor summarize the distribution of gradient directions within the neighborhood of p defined by the window.
This information can be visualized as an ellipsoid whose semi-axes are equal to the eigenvalues and directed along their corresponding eigenvectors.
This situation occurs, for instance, when p lies on a thin plate-like feature, or on the smooth boundary between two regions with contrasting values.
This situation occurs, for instance, when p lies on a thin line-like feature, or on a sharp corner of the boundary between two regions with contrasting values.
If all three eigenvalues are comparable, it means that the gradient directions in the window are more or less evenly distributed, with no marked preference, so that the function varies in an almost isotropic fashion around p.
This happens, for instance, when the function has spherical symmetry in the neighborhood of p. In particular, if the ellipsoid degenerates to a point (that is, if the three eigenvalues are zero), it means that the function is constant within the window (its gradient vanishes there).
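The eigenvalue patterns described above can be summarized in a small sketch; the function name and the eps and ratio thresholds are illustrative choices for this example, not canonical values.

```python
import numpy as np

def classify_neighborhood(S, eps=1e-6, ratio=0.2):
    """Classify a 3-D neighborhood from the eigenvalues of its
    structure tensor S (a symmetric 3x3 matrix)."""
    l1, l2, l3 = np.linalg.eigvalsh(S)[::-1]   # descending: l1 >= l2 >= l3
    if l1 < eps:
        return "constant"    # all eigenvalues ~ 0: zero gradient in the window
    if l2 < ratio * l1:
        return "plate"       # one dominant direction: plate-like feature / boundary
    if l3 < ratio * l2:
        return "line"        # two dominant directions: line-like feature / corner
    return "isotropic"       # gradients evenly distributed, no preference
```

For instance, a tensor with one dominant eigenvalue (a rank-1 outer product of a single gradient direction) is classified as plate-like.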
The multi-scale structure tensor is, in contrast to other one-parameter scale-space features, an image descriptor that is defined over two scale parameters.
One scale parameter, the local scale, is needed for determining the amount of pre-smoothing when computing the image gradient.
The other scale parameter, the integration scale, determines the weights for the region in space over which the components of the outer product of the gradient with itself are accumulated.
If, however, one were to naively apply, for example, a box filter, then undesirable artifacts could easily occur.
There are different ways of handling the two-parameter scale variations in this family of image descriptors.
By coupling the two scale parameters, we obtain a reduced, self-similar one-parameter variation, which is frequently used to simplify computational algorithms, for example in corner detection, interest point detection, texture analysis and image matching.
In practice, a self-similar (geometric) distribution of scale levels is usually used, with i ranging from 0 to some maximum scale index m. Thus, the discrete scale levels will bear certain similarities to an image pyramid, although spatial subsampling is not necessarily used, in order to preserve more accurate data for subsequent processing stages.
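A self-similar scale schedule of this kind might look as follows; the base scale t0, ratio r, and maximum index m are arbitrary example values, and the relation sigma = sqrt(t) assumes the usual convention that the scale parameter t is the variance of the Gaussian.

```python
import numpy as np

t0, r, m = 1.0, 2.0, 4                     # example values only
t = [t0 * r**i for i in range(m + 1)]      # geometric scale levels t_i = t0 * r**i
sigma = [np.sqrt(ti) for ti in t]          # Gaussian standard deviation per level

# Unlike a classical image pyramid, the image itself is kept at full
# resolution; only the smoothing scale grows between levels.
```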
[9][13][14][15][16][17][18] The structure tensor also plays a central role in the Lucas-Kanade optical flow algorithm, and in its extensions to estimate affine shape adaptation,[11] where the magnitude of the smaller eigenvalue serves as an indicator of the reliability of the computed result.
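The connection to Lucas-Kanade can be made concrete in a minimal sketch for a single window, whose 2x2 system matrix is precisely the windowed structure tensor; the function name and the eigenvalue threshold are illustrative choices for this example.

```python
import numpy as np

def lucas_kanade_window(Ix, Iy, It, w=None, lam_min_thresh=1e-9):
    """Estimate the optical flow (u, v) in one window from the spatial
    gradients Ix, Iy and the temporal derivative It by solving the
    Lucas-Kanade normal equations S v = b."""
    Ix, Iy, It = (np.ravel(a).astype(float) for a in (Ix, Iy, It))
    w = np.ones_like(Ix) if w is None else np.ravel(w)
    # the 2x2 system matrix is the windowed structure tensor
    S = np.array([[np.sum(w * Ix * Ix), np.sum(w * Ix * Iy)],
                  [np.sum(w * Ix * Iy), np.sum(w * Iy * Iy)]])
    b = -np.array([np.sum(w * Ix * It), np.sum(w * Iy * It)])
    lam_min = np.linalg.eigvalsh(S)[0]    # smallest eigenvalue of S
    if lam_min < lam_min_thresh:          # flat region or aperture problem
        return np.zeros(2), lam_min
    return np.linalg.solve(S, b), lam_min
```

When the smallest eigenvalue is near zero the system is ill-conditioned (the aperture problem), which is why the eigenvalues of the structure tensor double as a reliability measure for the flow estimate.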
The tensor has been used for scale space analysis,[7] estimation of local surface orientation from monocular or binocular cues,[12] non-linear fingerprint enhancement,[19] diffusion-based image processing,[20][21][22][23] and several other image processing problems.
To obtain true Galilean invariance, however, the shape of the spatio-temporal window function also needs to be adapted,[25][26] corresponding to the transfer of affine shape adaptation[11] from spatial to spatio-temporal image data.