Count sketch

Count sketch is a type of dimensionality reduction that is particularly efficient in statistics, machine learning and algorithms.

[1][2] It was invented by Moses Charikar, Kevin Chen and Martin Farach-Colton[3] in an effort to speed up the AMS Sketch by Alon, Matias and Szegedy for approximating the frequency moments of streams[4] (these calculations require counting of the number of occurrences for the distinct elements of the stream).

The sketch is nearly identical[citation needed] to the Feature hashing algorithm by John Moody,[5] but differs in its use of hash functions with low dependence, which makes it more practical.

In order to still have a high probability of success, the median trick is used to aggregate multiple count sketches, rather than the mean.

These properties allow use for explicit kernel methods, bilinear pooling in neural networks and is a cornerstone in many numerical linear algebra algorithms.

[6] The inventors of this data structure offer the following iterative explanation of its operation:[3] 1.

(to be defined later) independently choose

random hash functions

It is necessary that the hash families from which

are chosen be pairwise independent.

in the stream, add

At the end of this process, one has

where To estimate the count of

s one computes the following value: The values

are unbiased estimates of how many times

has appeared in the stream.

is the length of the stream and

off from the true value, with probability

Alternatively Count-Sketch can be seen as a linear mapping with a non-linear reconstruction function.

matrices, defined by for

This gives the same guarantees as stated above, if we take

The count sketch projection of the outer product of two vectors is equivalent to the convolution of two component count sketches.

The count sketch computes a vector convolution

are independent count sketch matrices.

Pham and Pagh[8] show that this equals

– a count sketch

of the outer product of vectors, where

denotes Kronecker product.

The fast Fourier transform can be used to do fast convolution of count sketches.

By using the face-splitting product[9][10][11] such structures can be computed much faster than normal matrices.