Margin classifier

In machine learning (ML), a margin classifier is a classification model that can report, for each data sample, an associated distance from the decision boundary.
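As an illustration, the margin of a sample under a linear classifier can be taken as its Euclidean distance from the separating hyperplane. The sketch below assumes a hyperplane w·x + b = 0 and labels in {-1, +1}; the function names and the sign convention (positive when the sample lies on the side matching its label) are assumptions made for this example, not part of the definition above.

```python
import math


def signed_distance(w, b, x):
    """Signed Euclidean distance of point x from the hyperplane w.x + b = 0."""
    norm = math.sqrt(sum(wi * wi for wi in w))
    if norm == 0:
        raise ValueError("w must be a non-zero vector")
    return (sum(wi * xi for wi, xi in zip(w, x)) + b) / norm


def linear_margin(w, b, x, y):
    """Margin of a labelled sample (x, y) with y in {-1, +1}.

    Assumed convention: positive when x lies on the side of the
    hyperplane that matches its label, negative otherwise.
    """
    return y * signed_distance(w, b, x)


# Hypothetical hyperplane 3*x1 + 4*x2 - 5 = 0
w, b = [3.0, 4.0], -5.0
print(linear_margin(w, b, [3.0, 4.0], +1))  # correctly classified sample
print(linear_margin(w, b, [0.0, 0.0], +1))  # misclassified sample
```

Other distance measures could be substituted for the Euclidean norm in `signed_distance` without changing the overall idea.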

The notion of margins is important in several ML classification algorithms, as it can be used to bound the generalization error of these classifiers.

These bounds are frequently shown using the VC dimension.

Such generalization error bounds are particularly prominent for boosting algorithms and support vector machines.

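The margin for an iterative boosting algorithm given a dataset with two classes can be defined as follows: the classifier is given a sample pair $(x, y)$, where $x \in X$ is drawn from a domain space $X$ and $y \in \{-1, +1\}$ is the sample's label.

At each iteration $j$, the algorithm selects a classifier $h_j \in C$, where $C$ is a space of possible classifiers that predict real values.

Each selected classifier is weighted by a coefficient $\alpha_j \in \mathbb{R}$ chosen by the boosting algorithm, and at iteration $t$ the margin of a sample $x$ can be defined as

$$\frac{y \sum_{j=1}^{t} \alpha_j h_j(x)}{\sum_{j=1}^{t} |\alpha_j|}$$

Under this definition, the margin is positive if the sample is labeled correctly and negative if it is labeled incorrectly.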

This definition may be modified and is not the only way to define the margin for boosting algorithms.
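The boosting margin can be sketched in a few lines of code. The example assumes the commonly used normalized form, y · Σ α_j h_j(x) / Σ |α_j|, where the h_j are the real-valued classifiers selected so far and the α_j are their weights. The threshold stumps and weights below are hypothetical values chosen only for illustration.

```python
def boosting_margin(x, y, hypotheses, alphas):
    """Normalized margin of sample (x, y), with y in {-1, +1}.

    Assumes margin = y * sum_j(alpha_j * h_j(x)) / sum_j(|alpha_j|).
    """
    if len(hypotheses) != len(alphas):
        raise ValueError("need exactly one weight per hypothesis")
    total = sum(abs(a) for a in alphas)
    if total == 0:
        raise ValueError("at least one weight must be non-zero")
    score = sum(a * h(x) for a, h in zip(alphas, hypotheses))
    return y * score / total


# Hypothetical ensemble of three threshold stumps on a 1-D input
stumps = [
    lambda x: 1.0 if x > 0 else -1.0,
    lambda x: 1.0 if x > 2 else -1.0,
    lambda x: 1.0 if x > -1 else -1.0,
]
alphas = [0.5, 0.25, 0.25]

for x in (3.0, 1.0, -2.0):
    print(x, boosting_margin(x, +1, stumps, alphas))
```

When every h_j outputs values in [-1, 1], this normalization keeps the margin in [-1, 1], which makes margins comparable across iterations.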

Many classifiers can report a margin for each sample, but only some use margin information while learning from a dataset.

Many boosting algorithms rely on the notion of a margin to assign weight to samples.

If a convex loss is utilized (as in AdaBoost or LogitBoost, for instance) then a sample with a higher margin will receive less (or equal) weight than a sample with a lower margin.

This leads the boosting algorithm to focus weight on low-margin samples.

In non-convex algorithms (e.g., BrownBoost), the margin still dictates the weighting of a sample, though the weighting is non-monotone with respect to the margin.
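The relationship between margin and sample weight can be illustrated numerically. The sketch below assumes the weight of a sample is the negative derivative of the loss with respect to its (unnormalized) margin m. This gives exp(-m) for the exponential loss associated with AdaBoost and 1 / (1 + exp(m)) for the logistic loss log(1 + exp(-m)). The third function is a Gaussian-shaped weighting of the kind used in BrownBoost; its parameters `s` and `c` are hypothetical values picked for the example, and the function is included only to show a non-monotone shape.

```python
import math


def exp_loss_weight(m):
    """Negative derivative of the exponential loss exp(-m)."""
    return math.exp(-m)


def logistic_loss_weight(m):
    """Negative derivative of the logistic loss log(1 + exp(-m))."""
    return 1.0 / (1.0 + math.exp(m))


def bump_weight(m, s=1.0, c=4.0):
    """Gaussian-shaped weight exp(-(m + s)^2 / c); s and c are illustrative."""
    return math.exp(-((m + s) ** 2) / c)


margins = [-2.0, -1.0, 0.0, 1.0, 2.0]
for m in margins:
    print(
        f"m={m:+.1f}  exp={exp_loss_weight(m):.3f}  "
        f"logistic={logistic_loss_weight(m):.3f}  bump={bump_weight(m):.3f}"
    )
```

For the two convex losses the weight never increases as the margin grows, whereas the bump-shaped weight peaks at an intermediate margin and falls off on both sides, so samples with very negative margins also receive little weight.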

Let $S$ be a set of $m$ data points, sampled independently at random from a distribution $D$.

Assume the VC-dimension of the underlying base classifier is $d$.