LPBoost

Consider a classification function $f: \mathcal{X} \to \{-1, 1\}$ which classifies samples from a space $\mathcal{X}$ into one of two classes, labelled $1$ and $-1$, respectively. LPBoost is an algorithm for learning such a classification function, given a set of training examples with known class labels.

LPBoost is a machine learning technique especially suited for joint classification and feature selection in structured domains.

As in all boosting classifiers, the final classification function is of the form

$$f(x) = \sum_{j=1}^{J} \alpha_j h_j(x),$$

where $\alpha_j$ are non-negative weights for weak classifiers $h_j: \mathcal{X} \to \{-1, 1\}$. Each individual weak classifier $h_j$ may be just a little bit better than random, but the resulting linear combination of many weak classifiers can perform very well.
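As a minimal illustration, the following Python sketch (the function name is ours, not from any particular library) evaluates such a weighted combination of weak classifiers:

```python
def ensemble_predict(x, weak_classifiers, alphas):
    """Boosted prediction: sign of the alpha-weighted sum of weak outputs.

    weak_classifiers: callables mapping a sample to -1 or +1
    alphas: non-negative weights, one per weak classifier
    """
    score = sum(a * h(x) for a, h in zip(alphas, weak_classifiers))
    return 1 if score >= 0 else -1
```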

LPBoost adjusts the weights of all weak classifiers selected so far in every iteration, a property known as being totally corrective. Early boosting methods, such as AdaBoost, do not have this property and converge more slowly.

More generally, let $\mathcal{H} = \{h(\cdot\,; \omega) \mid \omega \in \Omega\}$ be the possibly infinite set of weak classifiers, also termed hypotheses, where $\Omega$ is a parameter space such that each choice of $\omega \in \Omega$ uniquely defines a weak classifier $h(\cdot\,; \omega): \mathcal{X} \to \{-1, 1\}$.

One way to write down the problem LPBoost solves is as a linear program with infinitely many variables.

The primal linear program of LPBoost, optimizing over the non-negative weight vector $\alpha$, the non-negative vector $\xi$ of slack variables, and the margin $\rho$, is the following:

$$\begin{array}{rl}
\min\limits_{\alpha, \xi, \rho} & -\rho + D \sum_{n=1}^{\ell} \xi_n \\
\text{s.t.} & y_n \sum_{\omega \in \Omega} \alpha_\omega h(x_n; \omega) + \xi_n \geq \rho, \quad n = 1, \dots, \ell, \\
& \sum_{\omega \in \Omega} \alpha_\omega = 1, \\
& \xi_n \geq 0, \quad n = 1, \dots, \ell, \\
& \alpha_\omega \geq 0, \quad \omega \in \Omega, \\
& \rho \in \mathbb{R},
\end{array}$$

where $(x_n, y_n)$, $n = 1, \dots, \ell$, are the training samples with their labels. Note the effect of the slack variables $\xi \geq 0$: their one-norm is penalized in the objective function by a constant factor $D$, which, if chosen small enough, always leads to a primal feasible linear program.
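For a finite, already-generated set of weak classifiers, this restricted primal can be handed to any LP solver. A sketch using scipy.optimize.linprog follows; the output matrix H and the function name are our own conventions, not part of the original formulation:

```python
import numpy as np
from scipy.optimize import linprog

def solve_primal(H, y, D):
    """Solve the restricted LPBoost primal over a finite hypothesis set.

    H : (l, J) matrix with H[n, j] = h_j(x_n) in {-1, +1}
    y : (l,) label vector in {-1, +1}
    D : penalization constant for the slack variables
    Variables are ordered [alpha (J), xi (l), rho (1)].
    """
    l, J = H.shape
    c = np.concatenate([np.zeros(J), D * np.ones(l), [-1.0]])  # min -rho + D*sum(xi)
    # margin constraints: -y_n * sum_j alpha_j H[n, j] - xi_n + rho <= 0
    A_ub = np.hstack([-(y[:, None] * H), -np.eye(l), np.ones((l, 1))])
    b_ub = np.zeros(l)
    A_eq = np.concatenate([np.ones(J), np.zeros(l), [0.0]])[None, :]  # sum alpha = 1
    b_eq = [1.0]
    bounds = [(0, None)] * (J + l) + [(None, None)]  # alpha, xi >= 0; rho free
    res = linprog(c, A_ub=A_ub, b_ub=b_ub, A_eq=A_eq, b_eq=b_eq, bounds=bounds)
    alpha, xi, rho = res.x[:J], res.x[J:J + l], res.x[-1]
    return alpha, xi, rho
```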

When the above linear program was first written down in early publications about boosting methods, it was disregarded as intractable due to the large number of variables $\alpha$.

Only later was it discovered that such linear programs can indeed be solved efficiently using the classic technique of column generation.

In a linear program a column corresponds to a primal variable.

Column generation is a technique to solve large linear programs.

It typically works in a restricted problem, dealing only with a subset of variables.

By generating primal variables iteratively and on demand, the original unrestricted problem is eventually recovered. Moreover, by choosing the columns to generate cleverly, the problem can be solved so that only a small fraction of columns has to be created, while the obtained solution is still guaranteed to be optimal for the original full problem.
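A schematic column-generation loop for LPBoost might look as follows. Both helper arguments are assumptions of this sketch: pricing is the user-supplied search over the hypothesis space, and solve_restricted re-solves the restricted problem over the columns generated so far (for instance with the LP code above) and returns its primal and dual solution:

```python
import numpy as np

def lpboost(X, y, pricing, solve_restricted, D, theta=1e-4, max_iters=200):
    """Schematic LPBoost via column generation.

    pricing(X, y, lam): assumed to return a weak classifier h maximizing
        sum_n lam_n * y_n * h(x_n), i.e. the next column to generate.
    solve_restricted(X, y, hypotheses, D): assumed to re-solve the
        restricted problem and return (alpha, lam, gamma): primal weights,
        dual sample weights and dual objective value.
    """
    l = len(y)
    lam = np.full(l, 1.0 / l)   # initial uniform dual weights over samples
    gamma = 0.0
    hypotheses, alpha = [], []
    for _ in range(max_iters):
        h = pricing(X, y, lam)                        # generate candidate column
        edge = sum(lam[n] * y[n] * h(X[n]) for n in range(l))
        if edge + gamma <= theta:                     # no constraint violated enough
            break
        hypotheses.append(h)
        alpha, lam, gamma = solve_restricted(X, y, hypotheses, D)
    return hypotheses, alpha
```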

Columns in the primal linear program correspond to rows in the dual linear program. The equivalent dual linear program of LPBoost is

$$\begin{array}{rl}
\max\limits_{\lambda, \gamma} & \gamma \\
\text{s.t.} & \sum_{n=1}^{\ell} y_n h(x_n; \omega) \lambda_n + \gamma \leq 0, \quad \omega \in \Omega, \\
& 0 \leq \lambda_n \leq D, \quad n = 1, \dots, \ell, \\
& \sum_{n=1}^{\ell} \lambda_n = 1.
\end{array}$$

For linear programs the optimal values of the primal and dual problem are equal.

For the above primal and dual problems, the optimal value is equal to the negative 'soft margin'.

The soft margin is the size of the margin separating positive from negative training instances minus positive slack variables that carry penalties for margin-violating samples.

Thus, the soft margin may be positive although not all samples are linearly separated by the classification function.
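In terms of the primal variables above (a restatement of this duality identity, with $D$ the penalization constant from the primal objective):

$$\text{soft margin} = \rho^* - D \sum_{n=1}^{\ell} \xi_n^* = -\gamma^*,$$

where asterisks denote the optimal primal and dual values.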

Consider a subset of the constraints in the dual problem. For any finite subset we can solve the resulting linear program and thus satisfy all of the selected constraints. If we could prove that among the constraints we did not add, no single one is violated, we would have proven that solving our restricted problem is equivalent to solving the original problem. More formally, let $\gamma^*$ be the optimal objective function value of a restricted instance. Then we can formulate a search problem for the most violated constraint in the original problem space, namely finding

$$\omega^* = \operatorname*{argmax}_{\omega \in \Omega} \sum_{n=1}^{\ell} y_n h(x_n; \omega) \lambda_n,$$

that is, a single weak classifier in $\mathcal{H}$ maximizing the left-hand side of the dual constraint. If the constraint for $\omega^*$ is not violated, none of the omitted constraints can be active, and the restricted problem is equivalent to the original one.
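Over a finite pool of candidate hypotheses, this search reduces to picking the column with the largest weighted edge. A sketch, where H_pool and the function name are our own conventions:

```python
import numpy as np

def most_violated(H_pool, y, lam, gamma, theta=0.0):
    """Pick the candidate maximizing sum_n lam_n * y_n * h_k(x_n).

    H_pool: (l, K) matrix, H_pool[n, k] = output of candidate k on sample n
    Returns the best column index and whether its dual constraint is
    violated by more than theta.
    """
    scores = (lam * y) @ H_pool        # left-hand sides of the dual constraints
    k = int(np.argmax(scores))
    return k, scores[k] + gamma > theta
```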

The search is stopped once no constraint is violated by more than a convergence threshold $\theta \geq 0$. If $\theta$ is set to zero, the solution obtained is the globally optimal solution of the above linear program. In practice, $\theta$ is set to a small positive value in order to obtain a good solution quickly.

In practice, however, LPBoost is known to converge quickly, often faster than other formulations.

LPBoost is an ensemble learning method and thus does not dictate the choice of base learners, that is, the space of hypotheses $\mathcal{H}$.

Demiriz et al. showed that under mild assumptions, any base learner can be used.

If the base learners are particularly simple, they are often referred to as decision stumps.

A large variety of base learners has been used with boosting in the literature.

For example, if $\mathcal{X} \subseteq \mathbb{R}^n$, a base learner could be a linear soft-margin support vector machine. Even simpler is a decision stump of the form

$$h(x; \omega, p, t) := \begin{cases} \omega & \text{if } x_p \leq t, \\ -\omega & \text{otherwise}, \end{cases}$$

with $\omega \in \{-1, 1\}$, a feature index $p \in \{1, \dots, n\}$ and a threshold $t \in \mathbb{R}$. Such a stump looks only along a single dimension $p$ of the input space and thresholds the respective feature with the constant $t$.
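A stump of this form is a one-liner; the second helper below (our naming) enumerates a candidate pool of stumps from the training data, which can serve as H_pool in the search sketched earlier:

```python
import numpy as np

def stump(X, omega, p, t):
    """h(x; omega, p, t): omega if x_p <= t, -omega otherwise, per row of X."""
    return np.where(X[:, p] <= t, omega, -omega)

def stump_pool(X):
    """Outputs of all stumps with thresholds taken from observed feature values."""
    columns = []
    for p in range(X.shape[1]):
        for t in np.unique(X[:, p]):
            for omega in (1, -1):
                columns.append(stump(X, omega, p, t))
    return np.column_stack(columns)   # shape (l, number of candidate stumps)
```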