Segmentation-based object categorization

The image segmentation problem is concerned with partitioning an image into multiple regions according to some homogeneity criterion.

This article is primarily concerned with graph theoretic approaches to image segmentation applying graph partitioning via minimum cut or maximum cut.

Segmentation-based object categorization can be viewed as a specific case of spectral clustering applied to image segmentation.

The set of points in an arbitrary feature space can be represented as a weighted undirected complete graph G = (V, E), where the nodes of the graph are the points in the feature space.

In this context, we can formulate the image segmentation problem as a graph partitioning problem that asks for a partition

, where, according to some measure, the vertices in any set

have high similarity, and the vertices in two different sets

measures the total similarity of vertices in the same part.

However, we can find in polynomial time a cut

of small normalized weight

After some algebraic manipulations, we get: subject to the constraints: Minimizing

To make the problem tractable, we relax the constraints on

The relaxed problem can be solved by solving the generalized eigenvalue problem

The partitioning algorithm: Solving a standard eigenvalue problem for all eigenvectors (using the QR algorithm, for instance) takes

This is impractical for image segmentation applications where

Since only one eigenvector, corresponding to the second smallest generalized eigenvalue, is used by the uncut algorithm, efficiency can be dramatically improved if the solve of the corresponding eigenvalue problem is performed in a matrix-free fashion, i.e., without explicitly manipulating with or even computing the matrix W, as, e.g., in the Lanczos algorithm.

Matrix-free methods require only a function that performs a matrix-vector product for a given vector, on every iteration.

For image segmentation, the matrix W is typically sparse, with a number of nonzero entries

For high-resolution images, the second eigenvalue is often ill-conditioned, leading to slow convergence of iterative eigenvalue solvers, such as the Lanczos algorithm.

Preconditioning is a key technology accelerating the convergence, e.g., in the matrix-free LOBPCG method.

Computing the eigenvector using an optimally preconditioned matrix-free method takes

time, which is the optimal complexity, since the eigenvector has

scikit-learn[3] uses LOBPCG from SciPy with algebraic multigrid preconditioning for solving the eigenvalue problem for the graph Laplacian to perform image segmentation via spectral graph partitioning as first proposed in [4] and actually tested in [5] and.

[6] OBJ CUT[7] is an efficient method that automatically segments an object.

The OBJ CUT method is a generic method, and therefore it is applicable to any object category model.

Given an image D containing an instance of a known object category, e.g. cows, the OBJ CUT algorithm computes a segmentation of the object, that is, it infers a set of labels m. Let m be a set of binary labels, and let

is a shape prior on the labels from a layered pictorial structure (LPS) model).

A unary term consists of the likelihood

based on color, and the unary potential

A pairwise term consists of a prior