Sauer–Shelah lemma

In combinatorial mathematics and extremal set theory, the Sauer–Shelah lemma states that every family of sets with small VC dimension consists of a small number of sets.

It is named after Norbert Sauer and Saharon Shelah, who published it independently of each other in 1972.

[1][2] The same result was also published slightly earlier and again independently, by Vladimir Vapnik and Alexey Chervonenkis, after whom the VC dimension is named.

[4][5] Buzaglo et al. call this lemma "one of the most fundamental results on VC-dimension",[4] and it has applications in many areas.

Sauer's motivation was in the combinatorics of set systems,[1] while Shelah's was in model theory[2] and that of Vapnik and Chervonenkis was in statistics.

[3] It has also been applied in discrete geometry[6] and graph theory.

is the largest cardinality of a set shattered by

[6] In terms of these definitions, the Sauer–Shelah lemma states that if the VC dimension of

sets, as expressed using big O notation.

elements, and if the number of sets in the family,

[6] The bound of the lemma is tight: Let the family

[8] A strengthening of the Sauer–Shelah lemma, due to Pajor (1985), states that every finite set family

[11] Pajor's variant of the Sauer–Shelah lemma may be proved by mathematical induction; the proof has variously been credited to Noga Alon[12] or to Ron Aharoni and Ron Holzman.

[11] A different proof of the Sauer–Shelah lemma in its original form, by Péter Frankl and János Pach, is based on linear algebra and the inclusion–exclusion principle.

[6][8] This proof extends to other settings such as families of vector spaces and, more generally, geometric lattices.

[5] The original application of the lemma, by Vapnik and Chervonenkis, was in showing that every probability distribution can be approximated (with respect to a family of events of a given VC dimension) by a finite set of sample points whose cardinality depends only on the VC dimension of the family of events.

In this context, there are two important notions of approximation, both parameterized by a number

of samples, and a probability distribution on

-approximation of the original distribution if the probability of each event with respect to

differs from its original probability by at most

-net but not necessarily vice versa.

Vapnik and Chervonenkis used the lemma to show that set systems of VC dimension

Later authors including Haussler & Welzl (1987)[13] and Komlós, Pach & Woeginger (1992)[14] similarly showed that there always exist

The main idea of the proof of the existence of small

is larger than its median is very small, and the Sauer–Shelah lemma (applied to

) shows that only a small number of distinct events

need to be considered, so by the union bound, with nonzero probability,

-approximations, and the likelihood that a random sample of large enough cardinality has these properties, have important applications in machine learning, in the area of probably approximately correct learning.

[15] In computational geometry, they have been applied to range searching,[13] derandomization,[16] and approximation algorithms.

[17][18] Kozma & Moran (2013) use generalizations of the Sauer–Shelah lemma to prove results in graph theory such as that the number of strong orientations of a given graph is sandwiched between its numbers of connected and 2-edge-connected subgraphs.

Pajor's formulation of the Sauer–Shelah lemma: for every finite family of sets (green) there is another family of equally many sets (blue outlines) such that each set in the second family is shattered by the first family