Growth function

The growth function, also called the shatter coefficient or the shattering number, measures the richness of a set family or class of functions.

It is used especially in the context of statistical learning theory, to study properties of statistical learning methods.

The term 'growth function' was coined by Vapnik and Chervonenkis in their 1968 paper, where they also proved many of its properties.[1] It is a basic concept in machine learning.

Let $H$ be a set-family (a set of sets) and let $C$ be a set. Their intersection is defined as the following set-family:

$H \cap C := \{h \cap C \mid h \in H\}.$

The intersection-size (also called the index) of $H$ with respect to $C$ is its number of elements, i.e.: $|H \cap C|$. The growth function measures the size of $H \cap C$ as a function of $|C|$:

$\operatorname{Growth}(H, m) := \max_{C : |C| = m} |H \cap C|.$

Equivalently, let $H$ be a hypothesis-class (a set of binary functions) and let $C$ be a set with $m$ elements. The restriction of $H$ to $C$ is the set of binary functions on $C$ that can be derived from $H$:[3]: 45

$H_C := \{(h(x_1), \ldots, h(x_m)) \mid h \in H,\ x_1, \ldots, x_m \in C\}.$

The growth function measures the size of $H_C$ as a function of $|C|$:

$\operatorname{Growth}(H, m) := \max_{C : |C| = m} |H_C|.$
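As an illustration, the growth function of a small set-family over a finite domain can be computed by brute force, maximizing the intersection-size over all samples of a given size. This is a minimal sketch; the `growth` helper and the example family are illustrative, not from the source:

```python
from itertools import combinations

def growth(family, domain, m):
    """Brute-force Growth(H, m): the maximum number of distinct
    intersections h & C over all subsets C of the domain with |C| = m."""
    best = 0
    for sample in combinations(domain, m):
        C = frozenset(sample)
        intersections = {frozenset(h) & C for h in family}
        best = max(best, len(intersections))
    return best

# Example: half-lines (x > x0) restricted to a small finite domain.
domain = range(6)
half_lines = [{x for x in domain if x > x0} for x0 in range(-1, 6)]

print(growth(half_lines, domain, 3))  # m + 1 = 4
```

On any 3-point sample, the realizable intersections are the empty set and the three suffixes of the sorted sample, matching the $m + 1$ count derived for half-lines below.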

1. The domain is the real line $\mathbb{R}$. The set-family $H$ contains all the half-lines (rays) from a given number to positive infinity, i.e., all sets of the form $\{x \in \mathbb{R} \mid x > x_0\}$ for some $x_0 \in \mathbb{R}$. For any set $C$ of $m$ real numbers, the intersection $H \cap C$ contains $m+1$ sets: the empty set, the set containing the largest element of $C$, the set containing the two largest elements of $C$, and so on. Therefore: $\operatorname{Growth}(H, m) = m + 1$.

2. The domain is $\mathbb{R}^n$. The set-family $H$ contains all the homogeneous half-spaces, i.e., half-spaces whose boundary hyperplane passes through the origin. Then $\operatorname{Growth}(H, m) = \operatorname{Comp}(n, m)$, where Comp is the number of components in a partitioning of an n-dimensional space by m hyperplanes.[1]: 272–273

3. The domain is the real line $\mathbb{R}$. The set-family $H$ contains all the real intervals, i.e., all sets of the form $\{x \in \mathbb{R} \mid x_0 < x < x_1\}$ for some $x_0, x_1 \in \mathbb{R}$. For any set $C$ of $m$ real numbers, the intersection $H \cap C$ contains the empty set together with every run of consecutive elements of $C$; therefore: $\operatorname{Growth}(H, m) = \binom{m+1}{2} + 1 = \frac{m(m+1)}{2} + 1$.
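The closed-form count for intervals can be checked numerically: the sketch below counts the distinct subsets of an m-point set cut out by open intervals whose endpoints sweep a fine grid. The helper name is illustrative, not from the source:

```python
def interval_growth(m):
    """Count distinct subsets of an m-point set cut out by real intervals.
    By symmetry the points can be taken as 0, 1, ..., m-1."""
    points = range(m)
    patterns = set()
    # Half-integer endpoints are enough to realize every run of points.
    cuts = [k / 2 for k in range(-1, 2 * m)]
    for lo in cuts:
        for hi in cuts:
            patterns.add(frozenset(p for p in points if lo < p < hi))
    return len(patterns)

# Check Growth(intervals, m) = m(m+1)/2 + 1 for small m.
for m in range(1, 7):
    assert interval_growth(m) == m * (m + 1) // 2 + 1
print("verified for m = 1..6")
```

Each realizable pattern is a run of consecutive points (or the empty set), and there are $\binom{m+1}{2}$ nonempty runs, which is where the quadratic count comes from.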

The main property that makes the growth function interesting is that it can be either polynomial or exponential, nothing in between.

Note that if $H$ is finite, then $\operatorname{Growth}(H, m) \le |H|$ for every $m$. Therefore, the growth function is mainly interesting when $H$ is infinite. For every set-family $H$ and every integer $m$:

$\operatorname{Growth}(H, m) \le 2^m,$

i.e., the growth function has an exponential upper-bound. We say that a set-family $H$ shatters a set $C$ if their intersection contains all possible subsets of $C$, i.e., $|H \cap C| = 2^{|C|}$.

The growth function can be regarded as a refinement of the concept of VC dimension.

Recall that the VC dimension of $H$, $\operatorname{VCDim}(H)$, is the largest $d$ such that $\operatorname{Growth}(H, d) = 2^d$. The VC dimension only tells us whether $\operatorname{Growth}(H, m)$ is equal to $2^m$ or smaller, while the growth function tells us exactly how $\operatorname{Growth}(H, m)$ changes as a function of $m$.

Another connection between the growth function and the VC dimension is given by the Sauer–Shelah lemma:[3]: 49 if $\operatorname{VCDim}(H) = d$, then for every $m$:

$\operatorname{Growth}(H, m) \le \sum_{i=0}^{d} \binom{m}{i}.$

In particular, for $m > d$:

$\operatorname{Growth}(H, m) \le \left(\frac{em}{d}\right)^d = O(m^d),$

so the growth function is polynomial in $m$. This upper bound is tight, i.e., for all $m$ there exists a set-family $H$ with $\operatorname{VCDim}(H) = d$ such that:[2]: 56

$\operatorname{Growth}(H, m) = \sum_{i=0}^{d} \binom{m}{i}.$

While the growth-function is related to the maximum intersection-size, the entropy is related to the average intersection size:[1]: 272–273

$\operatorname{Entropy}(H, m) := \mathbb{E}\big[\log_2 |H \cap C_m|\big],$

where the expectation is taken over a random sample $C_m$ of $m$ elements. The intersection-size has the following property: for every two sets $C_1, C_2$,

$|H \cap (C_1 \cup C_2)| \le |H \cap C_1| \cdot |H \cap C_2|.$
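The Sauer–Shelah bound can be evaluated directly and compared with the examples above. In this sketch (helper name illustrative, not from the source), half-lines have VC dimension 1 and intervals have VC dimension 2, and both meet the bound exactly:

```python
from math import comb

def sauer_shelah_bound(d, m):
    """Sauer-Shelah upper bound on Growth(H, m) when VCDim(H) = d."""
    return sum(comb(m, i) for i in range(d + 1))

# Half-lines: d = 1, Growth(H, m) = m + 1 = C(m,0) + C(m,1).
for m in range(1, 10):
    assert sauer_shelah_bound(1, m) == m + 1

# Intervals: d = 2, Growth(H, m) = m(m+1)/2 + 1 = C(m,0) + C(m,1) + C(m,2).
for m in range(1, 10):
    assert sauer_shelah_bound(2, m) == m * (m + 1) // 2 + 1
print("bound is met exactly for half-lines and intervals")
```

This illustrates the tightness claim: for these classes the polynomial growth function equals the Sauer–Shelah sum for every $m$.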

Let $\Omega$ be a set on which a probability measure $\Pr$ is defined, and let $H$ be a family of subsets of $\Omega$ (i.e., a family of events). Suppose we choose a set $C_m$ of $m$ elements of $\Omega$, where each element is chosen at random according to the probability measure $\Pr$, independently of the others. For each event $h \in H$, we compare the following two quantities: its relative frequency in $C_m$, i.e., $|h \cap C_m| / m$, and its probability $\Pr[h]$. We are interested in the difference, $D(C_m, h) := \big|\,|h \cap C_m|/m - \Pr[h]\,\big|$. This difference satisfies the following upper bound:

$\Pr\big[\exists h \in H: D(C_m, h) > \varepsilon\big] \le 4 \cdot \operatorname{Growth}(H, 2m) \cdot e^{-\varepsilon^2 m / 8},$

which is equivalent to:[1]: Th.2

$\Pr\big[\forall h \in H: D(C_m, h) \le \varepsilon\big] \ge 1 - 4 \cdot \operatorname{Growth}(H, 2m) \cdot e^{-\varepsilon^2 m / 8}.$

In words: the probability that for all events in $H$, the relative-frequency is near the probability, is lower-bounded by an expression that depends on the growth-function of $H$.

A corollary of this is that, if the growth function is polynomial in $m$ (i.e., there exists some $n$ such that $\operatorname{Growth}(H, m) \le m^n + 1$), then the above probability approaches 1 as $m \to \infty$, i.e., the family $H$ enjoys uniform convergence in probability.
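To see the corollary quantitatively, one can plug a polynomial growth function into the bound $4 \cdot \operatorname{Growth}(H, 2m) \cdot e^{-\varepsilon^2 m / 8}$. A sketch, assuming the half-line class with $\operatorname{Growth}(H, m) = m + 1$ (the `vc_bound` helper is illustrative, not from the source):

```python
from math import exp

def vc_bound(growth_2m, m, eps):
    """Upper bound on the probability that some event's relative
    frequency deviates from its probability by more than eps."""
    return 4 * growth_2m * exp(-eps**2 * m / 8)

# Half-lines: Growth(H, 2m) = 2m + 1, polynomial in m, so the
# exponential factor dominates and the bound tends to 0.
eps = 0.1
for m in [100, 1000, 10000, 100000]:
    print(m, vc_bound(2 * m + 1, m, eps))
```

Because the polynomial factor grows much more slowly than the exponential factor decays, the bound vanishes as $m \to \infty$, which is exactly the uniform convergence statement.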