Vector quantization

Vector quantization (VQ) is a classical quantization technique from signal processing that allows the modeling of probability density functions by the distribution of prototype vectors.

Developed in the early 1980s by Robert M. Gray, it was originally used for data compression.

It works by dividing a large set of points (vectors) into groups having approximately the same number of points closest to them. Each group is represented by its centroid point, as in k-means and some other clustering algorithms.

Vector quantization is based on the competitive learning paradigm, so it is closely related to the self-organizing map model and to sparse coding models used in deep learning algorithms such as autoencoders.

The simplest training algorithm for vector quantization picks a sample point at random and moves the nearest quantization vector (centroid) a small fraction of the distance toward it, repeating until the codebook stops changing appreciably.[1] A more sophisticated algorithm reduces the bias in the density matching estimation and ensures that all points are used by including an extra sensitivity parameter: every centroid's sensitivity is increased slightly at each step, the winning centroid is the one with the smallest distance minus sensitivity, and the winner's sensitivity is reset to zero after it is moved.[citation needed] It is desirable to use a cooling schedule (a step size that decreases over time) to produce convergence: see Simulated annealing.
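A minimal sketch of the simple training loop described above, in Python with NumPy; the function name, step count, and learning rate are illustrative choices, not part of any standard library:

```python
import numpy as np

def train_vq(data, n_codewords=16, n_steps=10000, lr=0.05, seed=0):
    """Simple online VQ training: pick a random sample and move the
    nearest codebook vector a small fraction of the way toward it."""
    rng = np.random.default_rng(seed)
    # Initialize the codebook with randomly chosen data points.
    codebook = data[rng.choice(len(data), n_codewords, replace=False)].astype(float)
    for _ in range(n_steps):
        x = data[rng.integers(len(data))]                  # pick a sample point at random
        nearest = np.argmin(np.linalg.norm(codebook - x, axis=1))
        codebook[nearest] += lr * (x - codebook[nearest])  # move it toward the sample
    return codebook
```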

Lossy data correction, or prediction, is used to recover data missing from some dimensions. It is done by finding the nearest group using the dimensions that are available, then predicting the missing dimensions under the assumption that they take the same values as the group's centroid.
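A minimal sketch of that imputation step, assuming a codebook has already been trained and missing entries are marked with NaN; the names are illustrative:

```python
import numpy as np

def impute_with_codebook(sample, codebook):
    """Fill NaN entries of `sample` by finding the nearest codebook vector
    on the observed dimensions and copying its values into the gaps."""
    observed = ~np.isnan(sample)
    dists = np.linalg.norm(codebook[:, observed] - sample[observed], axis=1)
    centroid = codebook[np.argmin(dists)]
    filled = sample.copy()
    filled[~observed] = centroid[~observed]
    return filled
```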

Compression works by encoding values from a multidimensional vector space into a finite set of values from a discrete subspace of lower dimension; a lower-space vector requires less storage space, so the data is compressed.

In some cases, a codebook can also be used to entropy-code the discrete value in the same step, by generating a prefix-coded variable-length encoded value as its output.
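The basic compression step can be sketched as plain nearest-neighbor encoding of each vector into a codebook index; entropy coding of the resulting index stream would be a separate step, and all names below are illustrative:

```python
import numpy as np

def vq_encode(vectors, codebook):
    """Map each input vector to the index of its nearest codebook entry;
    the stream of small integer indices is the compressed representation."""
    dists = np.linalg.norm(vectors[:, None, :] - codebook[None, :, :], axis=2)
    return np.argmin(dists, axis=1)

def vq_decode(indices, codebook):
    """Reconstruct an approximation of the original vectors from the indices."""
    return codebook[indices]
```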

The usage of video codecs based on vector quantization has declined significantly in favor of those based on motion compensated prediction combined with transform coding, e.g. those defined in MPEG standards, as the low decoding complexity of vector quantization has become less relevant.

VQ was also used in the 1980s for speech and speaker recognition.[6] More recently, it has also been used for efficient nearest neighbor search[7] and on-line signature recognition.

In pattern-recognition applications, one codebook is trained for each class (for example, each user in a biometric application) using that class's training data; an unknown signal is then quantized with every codebook, and the codebook that provides the smallest vector quantization distortion indicates the identified user.
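A sketch of that decision rule, assuming one codebook per user has already been trained; the dictionary layout and helper names are illustrative:

```python
import numpy as np

def mean_distortion(feature_vectors, codebook):
    """Average squared distance from each feature vector to its nearest codeword."""
    dists = np.linalg.norm(feature_vectors[:, None, :] - codebook[None, :, :], axis=2)
    return float(np.mean(np.min(dists, axis=1) ** 2))

def identify_user(feature_vectors, codebooks_by_user):
    """Return the user whose codebook quantizes the signal with the least distortion."""
    return min(codebooks_by_user,
               key=lambda user: mean_distortion(feature_vectors, codebooks_by_user[user]))
```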

The main advantage of VQ in pattern recognition is its low computational burden compared with other techniques such as dynamic time warping (DTW) and hidden Markov models (HMM).

The main drawback when compared to DTW and HMM is that it does not take into account the temporal evolution of the signals (speech, signature, etc.).

In order to overcome this problem, a multi-section codebook approach has been proposed.

If the update rule is chosen to minimize the expected squared quantization error[10] and the learning gain decreases over time so as to satisfy the Robbins-Monro conditions, then multiple iterations over the whole data set with a fixed number of prototypes converge incrementally to the solution of the k-means clustering algorithm.
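A sketch of this incremental scheme, using a per-prototype gain of 1/(number of wins), which satisfies the Robbins-Monro conditions; the function and parameter names are illustrative:

```python
import numpy as np

def online_kmeans_vq(data, n_prototypes=8, n_epochs=20, seed=0):
    """Incremental VQ whose learning gain decays as 1/(win count); repeated
    passes over the data converge toward the k-means solution."""
    rng = np.random.default_rng(seed)
    prototypes = data[rng.choice(len(data), n_prototypes, replace=False)].astype(float)
    counts = np.zeros(n_prototypes)               # how often each prototype has won
    for _ in range(n_epochs):
        for x in data[rng.permutation(len(data))]:
            winner = np.argmin(np.linalg.norm(prototypes - x, axis=1))
            counts[winner] += 1
            gain = 1.0 / counts[winner]           # decreasing gain: 1, 1/2, 1/3, ...
            prototypes[winner] += gain * (x - prototypes[winner])
    return prototypes
```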

VQ has been used to quantize a feature representation layer in the discriminator of generative adversarial networks (GANs).[11]

This feature quantization improves GAN training and yields improved performance on a variety of popular GAN models: BigGAN for image generation, StyleGAN for face synthesis, and U-GAT-IT for unsupervised image-to-image translation.

Part of this article was originally based on material from the Free On-line Dictionary of Computing and is used with permission under the GFDL.