Maximally stable extremal regions

In computer vision, the maximally stable extremal regions (MSER) technique is a method of blob detection in images.

This method of extracting a comprehensive number of corresponding image elements contributes to wide-baseline matching, and it has led to better stereo matching and object recognition algorithms.

In this form we can use a notion of a threshold intensity value which separates the region and its boundary.

The equation checks for regions that remain stable over a certain number of thresholds.

A maximally stable extremal region is found when the size of one of these black areas is the same (or nearly the same) as in the previous image.
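The threshold sweep described here can be sketched in a few lines. This is a minimal illustration, not a reference implementation: it assumes that pixels strictly below the threshold count as "black", uses 4-connectivity, and the helper name `black_area`, the seed pixel and the toy image are all hypothetical.

```python
from collections import deque

def black_area(image, seed, threshold):
    """Area of the 4-connected 'black' component (intensity < threshold) containing seed."""
    rows, cols = len(image), len(image[0])
    r0, c0 = seed
    if image[r0][c0] >= threshold:
        return 0  # the seed pixel is still white at this threshold
    seen = {seed}
    queue = deque([seed])
    while queue:
        r, c = queue.popleft()
        for nr, nc in ((r - 1, c), (r + 1, c), (r, c - 1), (r, c + 1)):
            inside = 0 <= nr < rows and 0 <= nc < cols
            if inside and (nr, nc) not in seen and image[nr][nc] < threshold:
                seen.add((nr, nc))
                queue.append((nr, nc))
    return len(seen)

# Toy image: a dark 2x2 blob on a bright background (assumed data).
image = [
    [200, 200, 200, 200, 200],
    [200,  10,  12, 200, 200],
    [200,  11,  10, 200, 200],
    [200, 200, 200, 200, 200],
]

# Area of the black region around pixel (1, 1) at several thresholds.
areas = {t: black_area(image, (1, 1), t) for t in (5, 11, 13, 100, 199, 201)}
```

In this toy case the area stays at 4 pixels for every threshold between 13 and 199, which is the kind of plateau the text describes as a stable region.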

In that sense, the concept of MSER is linked to that of the component tree of the image.

[2] The component tree indeed provides an easy way of implementing MSER.

Over a large range of thresholds, the local binarization is stable in certain regions, which have the properties listed below.

MSER consistently achieved the highest score across many tests, indicating that it is a reliable region detector.

After sorting, pixels are marked in the image, and the list of growing and merging connected components and their areas is maintained using the union-find algorithm.

During this process, the area of each connected component as a function of intensity is stored, producing a data structure.
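The sort-then-merge procedure above can be sketched as follows. This is a simplified illustration under stated assumptions: pixels are visited in increasing intensity order, 4-connectivity is used, and instead of a full component tree the hypothetical function `areas_by_intensity` only records the sorted component areas after each intensity level has been processed.

```python
def areas_by_intensity(image):
    """Map each intensity level to the sorted areas of the components present at that level."""
    rows, cols = len(image), len(image[0])
    parent, size, roots = {}, {}, set()

    def find(p):
        # Union-find lookup with path halving.
        while parent[p] != p:
            parent[p] = parent[parent[p]]
            p = parent[p]
        return p

    # Sort pixels by intensity so that components can only grow or merge.
    pixels = sorted((image[r][c], r, c) for r in range(rows) for c in range(cols))
    snapshots = {}
    for i, (value, r, c) in enumerate(pixels):
        p = (r, c)
        parent[p], size[p] = p, 1
        roots.add(p)
        for q in ((r - 1, c), (r + 1, c), (r, c - 1), (r, c + 1)):
            if q in parent:  # neighbour already marked (also filters out-of-image positions)
                a, b = find(p), find(q)
                if a != b:
                    if size[a] < size[b]:
                        a, b = b, a
                    parent[b] = a          # union by size
                    size[a] += size[b]
                    roots.discard(b)
        # Record the areas once the last pixel of this intensity has been added.
        if i + 1 == len(pixels) or pixels[i + 1][0] != value:
            snapshots[value] = sorted(size[root] for root in roots)
    return snapshots

# Toy image (assumed data): two dark strokes joined by a mid-grey pixel.
image = [
    [9, 9, 9, 9, 9],
    [9, 1, 5, 2, 9],
    [9, 1, 9, 2, 9],
    [9, 9, 9, 9, 9],
]
history = areas_by_intensity(image)
```

Here the two strokes exist as separate components of area 2 until intensity 5, where the connecting pixel merges them into one component of area 5.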

In the extremal regions, the 'maximally stable' ones are those corresponding to thresholds where the relative area change as a function of relative change of threshold is at a local minimum, i.e. the MSER are the parts of the image where local binarization is stable over a large range of thresholds.
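The stability criterion can be illustrated with a short sketch. It assumes the commonly cited form q(i) = (|Q(i+Δ)| - |Q(i-Δ)|) / |Q(i)| for a nested sequence of regions Q(i); the step Δ (`delta`), the function names and the area values are illustrative assumptions rather than part of the text above.

```python
def stability(areas, delta):
    """q(i) = (area[i + delta] - area[i - delta]) / area[i] for a nested region sequence."""
    return {
        i: (areas[i + delta] - areas[i - delta]) / areas[i]
        for i in range(delta, len(areas) - delta)
        if areas[i] > 0
    }

def maximally_stable_thresholds(areas, delta):
    """Thresholds at which q has a local minimum (the first index is taken on a plateau)."""
    q = stability(areas, delta)
    return [
        i for i in sorted(q)
        if i - 1 in q and i + 1 in q and q[i] < q[i - 1] and q[i] <= q[i + 1]
    ]

# Area of one growing region at thresholds 0..9 (assumed data).
areas = [1, 2, 4, 10, 11, 11, 11, 20, 40, 80]
stable = maximally_stable_thresholds(areas, delta=1)
```

With these values the region barely changes around threshold 5, so that is where the relative area change reaches its local minimum.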

Efficient algorithms for computing it exist; they run in quasi-linear time whatever the range of the weights.

[3] More recently, Nister and Stewenius have proposed a truly (if the weights are small integers) worst-case

Salembier et al.[7] The purpose of this algorithm is to match MSERs to establish correspondence points between images.

Unstable ones or those on non-planar surfaces or discontinuities are called 'corrupted measurements'.

By applying RANSAC to the centers of gravity of the regions, a rough epipolar geometry can be computed.

The regions are then filtered, and the ones with correlation of their transformed images above a threshold are chosen.

RANSAC is applied again with a narrower threshold, and the final epipolar geometry is estimated by the eight-point algorithm.

This algorithm can be tested with the WBS Image Matcher (epipolar or homography geometry constrained matches).

The MSER algorithm has been used in text detection by Chen, who combined MSER with Canny edges.

Canny edges are used to help cope with the sensitivity of MSER to blur.

MSER is first applied to the image in question to determine the character regions.

To enhance the MSER regions, any pixels outside the boundaries formed by Canny edges are removed.

The separation of the letters provided by the edges greatly increases the usability of MSER in the extraction of blurred text.

[8] An alternative use of MSER in text detection is the work by Shi using a graph model.

This method again applies MSER to the image to generate preliminary regions.

One cost function relates the distance from the node to the foreground and background.

[9] To enable text detection in a general scene, Neumann uses the MSER algorithm in a variety of projections.

In addition to the greyscale intensity projection, he uses the red, blue, and green color channels to detect text regions that are color distinct but not necessarily distinct in greyscale intensity.

This method allows for detection of more text than solely using the MSER+ and MSER- functions discussed above.
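The idea of running the detector on several projections can be sketched as below. This is only an illustration of the channel split: the greyscale formula (a plain channel average), the function name `projections` and the two-pixel image are assumptions, and the MSER detector that would be run on each projection is not shown.

```python
def projections(rgb_image):
    """Split an RGB image (rows of (r, g, b) tuples) into four single-channel images."""
    return {
        # Assumed greyscale projection: plain average of the three channels.
        "grey": [[(r + g + b) // 3 for r, g, b in row] for row in rgb_image],
        "red": [[r for r, _, _ in row] for row in rgb_image],
        "green": [[g for _, g, _ in row] for row in rgb_image],
        "blue": [[b for _, _, b in row] for row in rgb_image],
    }

# Two pixels with the same average intensity but different colors (assumed data).
rgb_image = [[(150, 60, 90), (60, 150, 90)]]
channels = projections(rgb_image)
```

The two pixels are indistinguishable in the grey projection (both 100) but clearly separated in the red and green projections, which is why a detector run on the color channels can find regions that the greyscale pass misses.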