Harris affine region detector

In the fields of computer vision and image analysis, the Harris affine region detector belongs to the category of feature detection.

Feature detection is a preprocessing step of several algorithms that rely on identifying characteristic points or interest points so as to make correspondences between images, recognize textures, categorize objects or build panoramas.

These affine-invariant detectors should be capable of identifying similar regions in images taken from different viewpoints that are related by a simple geometric transformation: scaling, rotation and shearing.

[1] Do not dwell too much on these two naming conventions; the important thing to understand is that the design of these interest points will make them compatible across images taken from several viewpoints.

[2] Earlier works in this direction include the use of affine shape adaptation by Lindeberg and Gårding for computing affine-invariant image descriptors, thereby reducing the influence of perspective image deformations,[3] the use of affine-adapted feature points for wide-baseline matching by Baumberg,[4] and the first use of scale-invariant feature points by Lindeberg; see [5][6][7] for an overview of the theoretical background.

This can alternatively be formulated by examining the changes of intensity due to shifts in a local window.

Around a corner point, the image intensity will change greatly when the window is shifted in an arbitrary direction.

Following this intuition and through a clever decomposition, the Harris detector uses the second moment matrix as the basis of its corner decisions.

The second moment matrix has also been called the autocorrelation matrix and has values closely related to the derivatives of image intensity.

The weighting function can be uniform, but is more typically an isotropic, circular Gaussian that averages over a local region while weighting values near the center more heavily.

The second moment matrix describes the shape of the autocorrelation measure due to shifts in window location.

If we take the eigenvalues of the second moment matrix, then these values provide a quantitative description of how the autocorrelation measure changes in space: its principal curvatures.

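[8] Rather than extracting these eigenvalues using methods like singular value decomposition, the Harris measure based on the trace and determinant is used: $R = \det(A) - \alpha \operatorname{trace}^2(A) = \lambda_1 \lambda_2 - \alpha (\lambda_1 + \lambda_2)^2$, where $A$ is the second moment matrix, $\lambda_1$ and $\lambda_2$ are its eigenvalues, and $\alpha$ is a constant.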

Thus, corner points are identified as local maxima of the Harris measure that are above a specified threshold.
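The measure can be illustrated on a toy image. The following is a minimal sketch, not a reference implementation: it assumes central-difference derivatives, a uniform square window instead of a Gaussian, and a constant of 0.04 (a commonly used value, chosen here only for illustration). The function names are hypothetical.

```python
def gradients(img):
    """Central-difference derivatives (Ix, Iy) of a 2D list of floats.

    Border pixels reuse the nearest valid neighbour (clamped indices).
    """
    h, w = len(img), len(img[0])
    ix = [[0.0] * w for _ in range(h)]
    iy = [[0.0] * w for _ in range(h)]
    for y in range(h):
        for x in range(w):
            xl, xr = max(x - 1, 0), min(x + 1, w - 1)
            yu, yd = max(y - 1, 0), min(y + 1, h - 1)
            ix[y][x] = (img[y][xr] - img[y][xl]) / 2.0
            iy[y][x] = (img[yd][x] - img[yu][x]) / 2.0
    return ix, iy


def second_moment(ix, iy, cx, cy, radius=1):
    """Entries (Ixx, Ixy, Iyy) summed over a uniform square window (assumption)."""
    h, w = len(ix), len(ix[0])
    sxx = sxy = syy = 0.0
    for y in range(max(cy - radius, 0), min(cy + radius, h - 1) + 1):
        for x in range(max(cx - radius, 0), min(cx + radius, w - 1) + 1):
            sxx += ix[y][x] * ix[y][x]
            sxy += ix[y][x] * iy[y][x]
            syy += iy[y][x] * iy[y][x]
    return sxx, sxy, syy


def harris_measure(sxx, sxy, syy, alpha=0.04):
    """R = det(A) - alpha * trace(A)^2 for A = [[sxx, sxy], [sxy, syy]]."""
    det = sxx * syy - sxy * sxy
    trace = sxx + syy
    return det - alpha * trace * trace


# Toy 9x9 image: a bright square occupying the lower-right region.
image = [[1.0 if (x >= 4 and y >= 4) else 0.0 for x in range(9)] for y in range(9)]
ix, iy = gradients(image)

r_corner = harris_measure(*second_moment(ix, iy, 4, 4))  # corner of the square
r_edge = harris_measure(*second_moment(ix, iy, 4, 7))    # on the vertical edge
r_flat = harris_measure(*second_moment(ix, iy, 1, 1))    # uniform area
```

In this toy case the measure is positive at the corner, negative on the edge and zero in the flat area. A full detector would additionally keep only local maxima of the measure above a threshold.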

[10] However, the points are not scale invariant and thus the second-moment matrix must be modified to reflect a scale-invariant property.

The derivatives in the x and y directions are applied to the smoothed image and calculated using a Gaussian kernel at the differentiation scale $\sigma_D$.

The integration scale parameter $\sigma_I$ determines the current scale at which the Harris corner points are detected.
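For reference, the scale-adapted second-moment matrix is usually written in the following form in the Harris-Laplace literature. The symbols used here ($\mu$, $g$, $L_x$, $L_y$, $\sigma_I$, $\sigma_D$) follow that convention and are an assumption about notation rather than a quotation from this text.

```latex
\mu(\mathbf{x}, \sigma_I, \sigma_D) = \sigma_D^2 \, g(\sigma_I) \otimes
\begin{bmatrix}
L_x^2(\mathbf{x}, \sigma_D) & L_x L_y(\mathbf{x}, \sigma_D) \\
L_x L_y(\mathbf{x}, \sigma_D) & L_y^2(\mathbf{x}, \sigma_D)
\end{bmatrix}
```

Here $g(\sigma_I)$ is a Gaussian window at the integration scale, $\otimes$ denotes convolution, $L_x$ and $L_y$ are derivatives of the Gaussian-smoothed image computed at the differentiation scale $\sigma_D$, and the factor $\sigma_D^2$ normalizes the derivatives with respect to scale.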

An iterative algorithm based on Lindeberg (1998) both spatially localizes the corner points and selects the characteristic scale.

When the stopping criterion is met, the found points represent those that maximize the Laplacian of Gaussian (LoG) across scales (scale selection) and maximize the Harris corner measure in a local neighborhood (spatial selection).
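The scale selection step alone can be sketched as follows. This is a simplified illustration under stated assumptions: instead of filtering an image, it uses the closed-form scale-normalized LoG response at the center of an ideal 2D Gaussian blob, and it picks the best scale from a hypothetical geometric ladder of candidates. The function names are illustrative only.

```python
import math


def normalized_log_at_blob_center(sigma, blob_sigma):
    """Magnitude of the scale-normalized LoG at the centre of a 2D Gaussian blob.

    Closed form for an ideal unit-mass Gaussian blob of standard deviation
    blob_sigma: sigma^2 / (pi * (blob_sigma^2 + sigma^2)^2).
    """
    v = blob_sigma ** 2 + sigma ** 2
    return sigma ** 2 / (math.pi * v * v)


def select_characteristic_scale(response, scales):
    """Return the candidate scale with the largest response magnitude."""
    return max(scales, key=lambda s: abs(response(s)))


# Hypothetical geometric ladder of candidate scales: 1.0, 1.4, 1.96, ...
scales = [1.4 ** k for k in range(8)]
best = select_characteristic_scale(
    lambda s: normalized_log_at_blob_center(s, blob_sigma=4.0), scales
)
```

For this idealized blob the response peaks when the filter scale equals the blob's own standard deviation, so the selected candidate is the ladder entry nearest to 4.0. In the iterative algorithm this step alternates with spatial localization of the Harris maximum.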

In order to be invariant to arbitrary affine transformations (and viewpoints), the mathematical framework must be revisited.

In fact, the eigenvectors and eigenvalues of the covariance matrix define the rotation and size of the ellipsoid.

This representation therefore completely defines an arbitrary elliptical affine region over which we want to integrate or differentiate.
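The relationship between a 2x2 covariance matrix and its ellipse can be sketched in closed form. The example assumes the convention that the semi-axis lengths are the square roots of the eigenvalues and that the matrix is symmetric positive semi-definite; `ellipse_from_covariance` is a hypothetical helper name.

```python
import math


def ellipse_from_covariance(a, b, c):
    """Semi-axes and orientation of the ellipse described by [[a, b], [b, c]].

    Returns (major, minor, theta): the semi-axis lengths are the square roots
    of the eigenvalues, and theta is the angle of the major axis in radians.
    Assumes the matrix is symmetric positive semi-definite.
    """
    mean = (a + c) / 2.0
    spread = math.hypot((a - c) / 2.0, b)
    lam_max, lam_min = mean + spread, mean - spread
    theta = 0.5 * math.atan2(2.0 * b, a - c)
    return math.sqrt(lam_max), math.sqrt(lam_min), theta


# Eigenvalues of this matrix are 4 and 1, with the major axis along the diagonal.
major, minor, theta = ellipse_from_covariance(2.5, 1.5, 2.5)
```

For this matrix the semi-axes are 2 and 1, and the major axis is rotated by 45 degrees.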

The second-moment matrix is computed in this normalized reference frame and should have an isotropic measure close to one at the final iteration.
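The isotropy criterion can be illustrated algebraically. The sketch below assumes the isotropic measure is the ratio of the smaller to the larger eigenvalue, and that a change of coordinates by a symmetric matrix U maps a second-moment matrix M to U M U. In the actual detector the matrix is re-estimated from the warped image at every iteration, so this shows only the idealized fixed point. All names are hypothetical.

```python
import math


def isotropy(a, b, c):
    """Ratio lambda_min / lambda_max of the symmetric matrix [[a, b], [b, c]]."""
    mean = (a + c) / 2.0
    spread = math.hypot((a - c) / 2.0, b)
    return (mean - spread) / (mean + spread)


def inverse_sqrt(a, b, c):
    """Entries of M^(-1/2) for a symmetric positive-definite 2x2 matrix M."""
    s = math.sqrt(a * c - b * b)    # sqrt(det M)
    t = math.sqrt(a + c + 2.0 * s)  # normalizer for sqrt(M)
    # sqrt(M) = (M + s*I) / t has determinant s; invert it in closed form.
    ra, rb, rc = (a + s) / t, b / t, (c + s) / t
    return rc / s, -rb / s, ra / s


def transform(m, u):
    """U M U for symmetric 2x2 matrices given as (a, b, c) triples."""
    (a, b, c), (p, q, r) = m, u
    # First form the full 2x2 product M U, then multiply by U on the left.
    mu00, mu01 = a * p + b * q, a * q + b * r
    mu10, mu11 = b * p + c * q, b * q + c * r
    return (p * mu00 + q * mu10, p * mu01 + q * mu11, q * mu01 + r * mu11)


m = (4.0, 1.0, 2.0)                           # anisotropic second-moment matrix
normalized = transform(m, inverse_sqrt(*m))   # close to the identity matrix
before, after = isotropy(*m), isotropy(*normalized)
```

Before normalization the isotropic measure is well below one; after applying the inverse square root it is one up to rounding, which is the condition the iteration aims for.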

The computational complexity of the Harris-affine detector is broken into two parts: initial point detection and affine region normalization.

The affine region normalization algorithm automatically detects the scale and estimates the shape adaptation matrix.

Rather than selecting the scale from a set of factors, the sped-up algorithm holds the scale constant across iterations and points.

Although this reduction in search space might decrease the complexity, the change can severely affect the convergence of the shape adaptation matrix.

After all interest points have been identified, the algorithm accounts for duplicates by comparing their spatial coordinates.
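A duplicate check on spatial coordinates can be sketched as below. The tolerance value and the function name are assumptions for illustration; a fuller version would compare further region parameters in the same way.

```python
def remove_duplicates(points, tol=1.0):
    """Keep the first of any points whose x and y both differ by less than tol."""
    kept = []
    for x, y in points:
        # A point is new only if it is far from every kept point in x or in y.
        if all(abs(x - kx) >= tol or abs(y - ky) >= tol for kx, ky in kept):
            kept.append((x, y))
    return kept


unique = remove_duplicates([(10.0, 10.0), (10.4, 10.2), (50.0, 50.0)])
```

Here the second point lies within the tolerance of the first and is discarded, leaving two points.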

Regions with an overlap error as high as 50% can still be matched successfully with a good descriptor.
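Overlap error is commonly defined as one minus the ratio of the intersection area to the union area of two regions. The sketch below approximates it for two ellipses by grid sampling. It assumes each ellipse is given as a center plus the entries of a symmetric positive-definite matrix, and it omits the scale normalization used in published evaluations. All names and the step size are illustrative.

```python
import math


def inside(px, py, ellipse):
    """True if (px, py) lies in the ellipse (cx, cy, a, b, c).

    The region is a*dx^2 + 2*b*dx*dy + c*dy^2 <= 1, where (dx, dy) is the
    offset of the point from the centre (cx, cy).
    """
    cx, cy, a, b, c = ellipse
    dx, dy = px - cx, py - cy
    return a * dx * dx + 2.0 * b * dx * dy + c * dy * dy <= 1.0


def bounding_box(ellipse):
    """Axis-aligned bounds (x0, x1, y0, y1) of the ellipse."""
    cx, cy, a, b, c = ellipse
    det = a * c - b * b
    half_w, half_h = math.sqrt(c / det), math.sqrt(a / det)
    return cx - half_w, cx + half_w, cy - half_h, cy + half_h


def overlap_error(e1, e2, step=0.01):
    """Approximate 1 - area(intersection) / area(union) by grid sampling."""
    x0a, x1a, y0a, y1a = bounding_box(e1)
    x0b, x1b, y0b, y1b = bounding_box(e2)
    x0, x1 = min(x0a, x0b), max(x1a, x1b)
    y0, y1 = min(y0a, y0b), max(y1a, y1b)
    inter = union = 0
    for i in range(int((y1 - y0) / step) + 1):
        py = y0 + (i + 0.5) * step
        for j in range(int((x1 - x0) / step) + 1):
            px = x0 + (j + 0.5) * step
            in1, in2 = inside(px, py, e1), inside(px, py, e2)
            inter += in1 and in2
            union += in1 or in2
    return 1.0 - inter / union


unit_circle = (0.0, 0.0, 1.0, 0.0, 1.0)
larger_circle = (0.0, 0.0, 0.5, 0.0, 0.5)  # radius sqrt(2), twice the area
error = overlap_error(unit_circle, larger_circle, step=0.02)
```

For these two concentric circles the intersection is half of the union, so the estimate is close to 0.5, which is the 50% level mentioned above.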

A brief summary of the results of Mikolajczyk et al. (2005) follows; see A comparison of affine region detectors for a more quantitative analysis.