Contextual image classification

Contextual image classification, a topic in pattern recognition within computer vision, is an approach to classification based on contextual information in images.

"Contextual" means that the approach focuses on the relationships between nearby pixels, a set also called the neighbourhood.

The goal is to classify images by using this contextual information.

The situation is similar to language processing: a single word may have multiple meanings unless context is provided, and the patterns within sentences are the informative segments we care about.

Likewise, a small portion of an image viewed in isolation can be hard to interpret, but as more of the surrounding context is included, the content becomes easier to recognise.

During segmentation, methods that do not use contextual information are sensitive to noise and variations, so the segmentation result contains many misclassified regions, and these regions are often small (e.g., a single pixel).

Compared with such techniques, contextual classification is robust to noise and substantial variations because it takes the continuity of the segments into account.

It is particularly effective against small regions caused by noise.

Contextual classification can be carried out as a two-stage process. Instead of classifying single pixels, neighbouring pixels can be merged into homogeneous regions that benefit from contextual information.
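One possible reading of a two-stage process, sketched here under stated assumptions, is to label every pixel independently first and then relabel each pixel by a majority vote over its neighbourhood. The function name `majority_filter`, the 3x3 window, and the toy label image are illustrative assumptions and are not taken from the text above.

```python
from collections import Counter

def majority_filter(labels, radius=1):
    """Relabel each pixel with the most common label in its neighbourhood.

    `labels` is a list of rows holding per-pixel class labels produced by a
    first, context-free classification stage (an assumption of this sketch).
    """
    height, width = len(labels), len(labels[0])
    result = [row[:] for row in labels]
    for y in range(height):
        for x in range(width):
            votes = Counter()
            # Collect labels in the (2*radius+1)^2 window, clipped at borders.
            for ny in range(max(0, y - radius), min(height, y + radius + 1)):
                for nx in range(max(0, x - radius), min(width, x + radius + 1)):
                    votes[labels[ny][nx]] += 1
            # Keep the current label on ties so the filter stays conservative.
            best = max(votes.values())
            if votes[labels[y][x]] < best:
                result[y][x] = max(votes, key=votes.get)
    return result

# Stage 1 output: a region of class 0 with a single misclassified pixel.
noisy = [
    [0, 0, 0, 0],
    [0, 1, 0, 0],
    [0, 0, 0, 0],
]
cleaned = majority_filter(noisy)
```

In this toy case the isolated pixel of class 1 is outvoted by its eight neighbours and absorbed into the surrounding region, which is the kind of one-pixel error that context-free methods tend to leave behind.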

The original spectral data can be enriched by adding the contextual information carried by the neighbouring pixels, or in some cases even replaced by it.

Pre-processing methods of this kind are widely used in textured image recognition.

Typical contextual features include mean values, variances, and texture descriptions.
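As a minimal sketch of such pre-processing, the following computes a local mean and variance for every pixel and attaches them to the original value. The function name `neighbourhood_features`, the square window, and the sample intensities are assumptions made for illustration only.

```python
def neighbourhood_features(image, radius=1):
    """Return per-pixel (value, local mean, local variance) feature tuples."""
    height, width = len(image), len(image[0])
    features = []
    for y in range(height):
        row = []
        for x in range(width):
            # Gather the window around (x, y), clipped at the image borders.
            window = [
                image[ny][nx]
                for ny in range(max(0, y - radius), min(height, y + radius + 1))
                for nx in range(max(0, x - radius), min(width, x + radius + 1))
            ]
            mean = sum(window) / len(window)
            variance = sum((v - mean) ** 2 for v in window) / len(window)
            row.append((image[y][x], mean, variance))
        features.append(row)
    return features

# Hypothetical single-band image with one bright outlier in the centre.
image = [
    [10, 10, 10],
    [10, 19, 10],
    [10, 10, 10],
]
features = neighbourhood_features(image)
centre_value, centre_mean, centre_variance = features[1][1]
```

The resulting tuples can be fed to a classifier in place of, or alongside, the raw pixel value; the local mean damps the outlier while the variance flags that the neighbourhood is not uniform.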

Contextual classification of image data is based on the Bayes minimum error classifier (sometimes loosely called a naive Bayes classifier).
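In symbols, a minimum error classifier picks the class with the largest posterior probability. The notation below is an assumption introduced for illustration: x_0 is the pixel being classified, f(x_0) is its feature vector including the contextual information from its neighbourhood, ω_1, ..., ω_R are the candidate classes, and θ_0 is the label assigned to x_0.

```latex
\theta_0 = \omega_r
\quad \text{if} \quad
P\bigl(\omega_r \mid f(x_0)\bigr) = \max_{s = 1, \dots, R} P\bigl(\omega_s \mid f(x_0)\bigr),
\qquad \text{where} \qquad
P\bigl(\omega_s \mid f(x_0)\bigr)
  = \frac{p\bigl(f(x_0) \mid \omega_s\bigr)\, P(\omega_s)}{p\bigl(f(x_0)\bigr)}
```

Because the denominator is the same for every class, comparing the products of likelihood and prior is sufficient to find the maximum.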

The calculation: apply minimum error classification to a pixel, assigning it the class whose posterior probability is highest given the pixel's feature vector.
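A minimal sketch of this calculation follows, assuming a single scalar contextual feature per pixel and Gaussian class-conditional densities. The class names, priors, means, and standard deviations are hypothetical values chosen for illustration, and `classify_min_error` is not a library function.

```python
import math

def gaussian_pdf(value, mean, std):
    """Density of a univariate normal distribution."""
    z = (value - mean) / std
    return math.exp(-0.5 * z * z) / (std * math.sqrt(2.0 * math.pi))

def classify_min_error(feature, classes):
    """Return the label with the highest posterior, plus all posteriors.

    `classes` maps a label to (prior, mean, std) of an assumed Gaussian
    class-conditional density over one contextual feature.
    """
    # Unnormalised posteriors: likelihood times prior for each class.
    scores = {
        label: prior * gaussian_pdf(feature, mean, std)
        for label, (prior, mean, std) in classes.items()
    }
    evidence = sum(scores.values())
    posteriors = {label: score / evidence for label, score in scores.items()}
    best = max(posteriors, key=posteriors.get)
    return best, posteriors

# Hypothetical class models, e.g. for the neighbourhood mean intensity.
classes = {
    "water": (0.5, 30.0, 10.0),
    "land": (0.5, 90.0, 15.0),
}
label, posteriors = classify_min_error(40.0, classes)
```

With these assumed parameters, a feature value of 40 lies one standard deviation from the "water" mean and more than three from the "land" mean, so "water" receives the larger posterior.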

Template matching is a "brute force" implementation of the basic steps of contextual image classification.

It keeps the entire list of templates throughout the process, and the number of possible combinations is extremely high.
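To illustrate why the template list grows so quickly, the sketch below enumerates every template for a small window and matches a window by exhaustive comparison. It assumes that a template is one assignment of binary values to the pixels of a window; the function names and the 3x3 window are illustrative assumptions.

```python
from itertools import product

def all_binary_templates(rows, cols):
    """Enumerate every binary pattern of a rows x cols window."""
    return [
        tuple(bits[r * cols:(r + 1) * cols] for r in range(rows))
        for bits in product((0, 1), repeat=rows * cols)
    ]

def match_template(window, templates):
    """Return the index of the template identical to `window`, or None."""
    key = tuple(tuple(row) for row in window)
    # Brute force: compare against every stored template in turn.
    for index, template in enumerate(templates):
        if template == key:
            return index
    return None

templates = all_binary_templates(3, 3)
template_count = len(templates)  # 2 ** 9 patterns for a 3x3 window
match_index = match_template([[0, 0, 0], [0, 1, 0], [0, 0, 0]], templates)
```

Under this assumption a 3x3 window already needs 512 templates, and the count doubles with every additional pixel, so storing and searching the full list quickly becomes impractical for larger windows.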

The pixels in an image can be regarded as a set of random variables, and a lower-order Markov chain can then be used to model the relationships among them.

The image is treated as a virtual line, that is, a one-dimensional sequence of pixels, and the method uses the conditional probability of each pixel given its predecessors.
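The idea can be sketched as follows, assuming a first-order chain and a row-by-row (raster) scan as the virtual line. Both choices, along with the function names and the toy label image, are assumptions made for illustration.

```python
from collections import defaultdict

def raster_scan(labels):
    """Flatten a 2-D label image into a 1-D sequence, row by row."""
    return [label for row in labels for label in row]

def transition_probabilities(sequence):
    """Estimate P(current label | previous label) by counting adjacent pairs."""
    counts = defaultdict(lambda: defaultdict(int))
    for previous, current in zip(sequence, sequence[1:]):
        counts[previous][current] += 1
    probabilities = {}
    for previous, following in counts.items():
        total = sum(following.values())
        probabilities[previous] = {
            current: count / total for current, count in following.items()
        }
    return probabilities

# Hypothetical labelled image with two vertical regions, "a" and "b".
labels = [
    ["a", "a", "b", "b"],
    ["a", "a", "b", "b"],
]
sequence = raster_scan(labels)
probabilities = transition_probabilities(sequence)
```

Note that the raster scan makes the last pixel of one row the predecessor of the first pixel of the next row, even though the two are far apart in the image. This is one limitation of treating the image as a line.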

Both the lower-order Markov chain and Hilbert space-filling curves treat the image as a line structure.
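A Hilbert curve visits every cell of a square grid exactly once while moving only between adjacent cells, so pixels that are close on the line are also close in the image. The sketch below uses the standard iterative index-to-coordinate conversion; the function names are illustrative, and the image is assumed to be square with a side length that is a power of two.

```python
def hilbert_index_to_xy(order, index):
    """Map a position along the Hilbert curve to (x, y) on a 2**order grid."""
    x = y = 0
    t = index
    side = 1
    while side < (1 << order):
        rx = 1 & (t // 2)
        ry = 1 & (t ^ rx)
        if ry == 0:
            # Rotate or flip the quadrant so that sub-curves join end to end.
            if rx == 1:
                x = side - 1 - x
                y = side - 1 - y
            x, y = y, x
        x += side * rx
        y += side * ry
        t //= 4
        side *= 2
    return x, y

def hilbert_scan(image):
    """Read a square image (side a power of two) along the Hilbert curve."""
    size = len(image)
    order = size.bit_length() - 1
    return [
        image[y][x]
        for x, y in (hilbert_index_to_xy(order, d) for d in range(size * size))
    ]

# Coordinates visited on a 4x4 grid, in curve order.
path = [hilbert_index_to_xy(2, d) for d in range(16)]
```

The sequence returned by `hilbert_scan` could then be modelled with a chain over successive elements, in the same way as any other one-dimensional ordering of the pixels.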

Markov meshes, however, take the two-dimensional information into account.
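As a rough sketch of the two-dimensional idea, the following estimates the probability of a pixel's label conditioned on two already-visited neighbours, the pixel to the left and the pixel above. This particular neighbourhood, the function name `mesh_conditionals`, and the toy label image are assumptions made for illustration; other neighbourhood shapes are possible.

```python
from collections import defaultdict

def mesh_conditionals(labels):
    """Estimate P(label | left label, upper label) by counting over the image."""
    counts = defaultdict(lambda: defaultdict(int))
    # Skip the first row and column, which lack an upper or left neighbour.
    for y in range(1, len(labels)):
        for x in range(1, len(labels[0])):
            context = (labels[y][x - 1], labels[y - 1][x])
            counts[context][labels[y][x]] += 1
    return {
        context: {
            label: count / sum(following.values())
            for label, count in following.items()
        }
        for context, following in counts.items()
    }

# Hypothetical labelled image with two vertical regions, 0 and 1.
labels = [
    [0, 0, 1, 1],
    [0, 0, 1, 1],
    [0, 0, 1, 1],
]
conditionals = mesh_conditionals(labels)
```

Unlike a chain along a single line, the context here combines a horizontal and a vertical neighbour, so a vertical region boundary is visible to the model through the pixel above.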