Fisher kernel

It combines the advantages of generative statistical models (like the hidden Markov model) and those of discriminative methods (like support vector machines).[1] The Fisher kernel makes use of the Fisher score, defined as U_X = ∇_θ log P(X | θ), the gradient of the log-likelihood of the data X with respect to the model parameters, with θ being a set (vector) of parameters.
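To make the definition concrete, here is a minimal sketch of the Fisher score and the resulting kernel K(x, y) = U_x^T I^{-1} U_y for a single 1-D Gaussian model with θ = (μ, σ). The model choice, function names, and the use of the analytic Fisher information are illustrative assumptions, not part of any particular library.

```python
import numpy as np

def fisher_score(x, mu, sigma):
    """Gradient of log N(x | mu, sigma^2) with respect to theta = (mu, sigma)."""
    d_mu = (x - mu) / sigma**2
    d_sigma = (x - mu) ** 2 / sigma**3 - 1.0 / sigma
    return np.array([d_mu, d_sigma])

def fisher_kernel(x, y, mu, sigma, info_inv):
    """Fisher kernel: score of x, times inverse Fisher information, times score of y."""
    u_x = fisher_score(x, mu, sigma)
    u_y = fisher_score(y, mu, sigma)
    return u_x @ info_inv @ u_y

# Analytic Fisher information of a 1-D Gaussian: diag(1/sigma^2, 2/sigma^2)
mu, sigma = 0.0, 1.0
info = np.diag([1.0 / sigma**2, 2.0 / sigma**2])
info_inv = np.linalg.inv(info)

k = fisher_kernel(0.5, -0.5, mu, sigma, info_inv)
```

In practice the Fisher information matrix is often approximated empirically, or simply replaced by the identity, since exact inversion is costly for large parameter vectors.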

Fisher kernels exist for numerous models, notably tf–idf,[3] Naive Bayes and probabilistic latent semantic analysis.[2]

The Fisher kernel can result in a compact and dense representation, which is desirable for image classification[4] and retrieval[5][6] problems.

The Fisher vector (FV) encoding stores, per component k of the Gaussian mixture model (GMM), the mean and covariance deviation vectors, aggregated over all elements of the local feature descriptors.
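The encoding can be sketched as follows for a diagonal-covariance GMM: each descriptor is softly assigned to the K components, and for each component the weighted first- and second-order deviations from that component's mean and standard deviation are pooled, giving a 2·K·D-dimensional vector. This is a simplified illustration (normalization constants and the common power/L2 post-processing are omitted); all names are illustrative.

```python
import numpy as np

def fisher_vector(descriptors, weights, means, sigmas):
    """Toy FV encoding for a diagonal-covariance GMM.

    descriptors: (N, D) local features; weights: (K,) mixture weights;
    means, sigmas: (K, D) component means and standard deviations.
    Returns a vector of length 2*K*D.
    """
    N, D = descriptors.shape
    K = means.shape[0]
    # Soft assignments gamma[n, k] proportional to w_k * N(x_n | mu_k, sigma_k^2);
    # the constant (2*pi)^(-D/2) cancels after normalization.
    log_prob = (-0.5 * ((descriptors[:, None, :] - means[None]) / sigmas[None]) ** 2
                - np.log(sigmas[None])).sum(-1) + np.log(weights)[None]
    log_prob -= log_prob.max(axis=1, keepdims=True)
    gamma = np.exp(log_prob)
    gamma /= gamma.sum(axis=1, keepdims=True)

    fv = []
    for k in range(K):
        diff = (descriptors - means[k]) / sigmas[k]          # (N, D)
        g = gamma[:, k:k + 1]                                # (N, 1)
        # First-order (mean) deviation vector for component k
        fv.append((g * diff).sum(0) / (N * np.sqrt(weights[k])))
        # Second-order (covariance) deviation vector for component k
        fv.append((g * (diff ** 2 - 1)).sum(0) / (N * np.sqrt(2 * weights[k])))
    return np.concatenate(fv)

# Illustrative usage with random descriptors and a 2-component GMM in 3-D
rng = np.random.default_rng(0)
desc = rng.normal(size=(10, 3))
weights = np.array([0.5, 0.5])
means = rng.normal(size=(2, 3))
sigmas = np.ones((2, 3))
fv = fisher_vector(desc, weights, means, sigmas)
```

Note that the dimensionality 2·K·D is independent of the number of descriptors N, which is what makes the representation compact and dense.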

In a systematic comparison, FV outperformed all compared encoding methods (Bag of Visual Words (BoW), Kernel Codebook encoding (KCB), Locality-constrained Linear Coding (LLC), Vector of Locally Aggregated Descriptors (VLAD)), showing that encoding second-order information (i.e., codeword covariances) indeed benefits classification performance.