In statistical classification, the Fisher kernel, named after Ronald Fisher, is a function that measures the similarity of two objects on the basis of sets of measurements for each object and a statistical model. In a classification procedure, the class for a new object (whose real class is unknown) can be estimated by minimising, across classes, an average of the Fisher kernel distance from the new object to each known member of the given class.
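As a minimal sketch of the idea, the example below assumes the usual formulation K(x, y) = U_x^T I^{-1} U_y, where U_x is the gradient of the log-likelihood of x with respect to the model parameters (the Fisher score) and I is the Fisher information matrix. The statistical model is taken to be a univariate Gaussian with parameters (mu, sigma2), chosen only because its score and Fisher information have simple closed forms. The function names, the kernel-induced distance, and the toy `classify` helper are illustrative assumptions rather than part of any particular library.

```python
import numpy as np

def fisher_score(x, mu, sigma2):
    """Gradient of log N(x | mu, sigma2) with respect to (mu, sigma2)."""
    d_mu = (x - mu) / sigma2
    d_sigma2 = ((x - mu) ** 2 - sigma2) / (2.0 * sigma2 ** 2)
    return np.array([d_mu, d_sigma2])

def fisher_information(sigma2):
    """Closed-form Fisher information of a univariate Gaussian in (mu, sigma2)."""
    return np.diag([1.0 / sigma2, 1.0 / (2.0 * sigma2 ** 2)])

def fisher_kernel(x, y, mu, sigma2):
    """Fisher kernel K(x, y) = U_x^T I^{-1} U_y."""
    u_x = fisher_score(x, mu, sigma2)
    u_y = fisher_score(y, mu, sigma2)
    return float(u_x @ np.linalg.solve(fisher_information(sigma2), u_y))

def fisher_distance(x, y, mu, sigma2):
    """Distance induced by the kernel (assumed here): sqrt((U_x - U_y)^T I^{-1} (U_x - U_y))."""
    diff = fisher_score(x, mu, sigma2) - fisher_score(y, mu, sigma2)
    return float(np.sqrt(diff @ np.linalg.solve(fisher_information(sigma2), diff)))

def classify(x_new, members_by_class, mu, sigma2):
    """Pick the class whose known members are, on average, closest to x_new."""
    return min(
        members_by_class,
        key=lambda c: np.mean(
            [fisher_distance(x_new, m, mu, sigma2) for m in members_by_class[c]]
        ),
    )
```

For instance, `classify(0.1, {"a": [0.0, 0.2], "b": [3.0, 3.5]}, mu=0.0, sigma2=1.0)` selects the class whose members lie closest to the new observation in the Fisher score space. In practice the model parameters would be fitted to data first; they are fixed here to keep the sketch short.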
The Fisher kernel is the kernel for a generative probabilistic model. As such, it constitutes a bridge between generative and discriminative models of documents.[2] Fisher kernels exist for numerous models, notably tf–idf,[3] naive Bayes and probabilistic latent semantic analysis.
Image classification and retrieval
The Fisher kernel can also be applied to image representation for classification or retrieval problems. The widely used bag-of-visual-words representation suffers from sparsity and high dimensionality. The Fisher kernel can instead yield a compact and dense representation, which is more desirable for image classification[4] and retrieval[5][6] problems.
The Fisher Vector (FV), a special, approximate, and improved case of the general Fisher kernel,[7] is an image representation obtained by pooling local image features. For each component k of a Gaussian mixture model (GMM), the FV encoding stores the deviations of the local feature descriptors from the component's mean and from its covariance, computed separately for each descriptor dimension. In a systematic comparison, FV outperformed all the other encoding methods compared (Bag of Visual Words (BoW), Kernel Codebook encoding (KCB), Locality Constrained Linear Coding (LLC), and Vector of Locally Aggregated Descriptors (VLAD)), showing that encoding second-order information (i.e. codeword covariances) does benefit classification performance.[8]
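The encoding described above can be sketched as follows. The sketch assumes a GMM with diagonal covariances whose parameters (`weights`, `means`, `variances`) have already been fitted elsewhere, and it follows the commonly used closed-form gradients with respect to the component means and standard deviations. The function name and argument layout are illustrative assumptions, and the power and L2 normalisation steps often applied afterwards are omitted.

```python
import numpy as np

def fisher_vector(X, weights, means, variances):
    """Fisher Vector of local descriptors X under a diagonal-covariance GMM.

    X: (T, D) descriptors; weights: (K,); means, variances: (K, D).
    Returns a vector of length 2 * K * D (mean part, then variance part).
    """
    X = np.asarray(X, dtype=float)
    weights = np.asarray(weights, dtype=float)
    means = np.asarray(means, dtype=float)
    variances = np.asarray(variances, dtype=float)
    T = X.shape[0]

    diff = X[:, None, :] - means[None, :, :]  # (T, K, D)

    # Log of w_k * N(x_t | mu_k, diag(var_k)) for every descriptor and component.
    log_prob = (
        np.log(weights)[None, :]
        - 0.5 * np.sum(np.log(2.0 * np.pi * variances), axis=1)[None, :]
        - 0.5 * np.sum(diff ** 2 / variances[None, :, :], axis=2)
    )
    # Soft assignments (posteriors), computed stably by subtracting the row maximum.
    log_prob -= log_prob.max(axis=1, keepdims=True)
    gamma = np.exp(log_prob)
    gamma /= gamma.sum(axis=1, keepdims=True)  # (T, K)

    z = diff / np.sqrt(variances)[None, :, :]  # deviations scaled per dimension

    # First-order (mean) and second-order (variance) deviation statistics.
    g_mu = np.einsum("tk,tkd->kd", gamma, z) / (T * np.sqrt(weights)[:, None])
    g_sigma = np.einsum("tk,tkd->kd", gamma, z ** 2 - 1.0) / (
        T * np.sqrt(2.0 * weights)[:, None]
    )
    return np.concatenate([g_mu.ravel(), g_sigma.ravel()])
```

With K mixture components and D-dimensional descriptors the result has 2KD entries regardless of how many descriptors the image contains, which is what makes the representation dense and of fixed length.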
Tommi Jaakkola and David Haussler (1998), Exploiting Generative Models in Discriminative Classifiers. In Advances in Neural Information Processing Systems 11, pages 487–493. MIT Press. ISBN 978-0-262-11245-1.
Nello Cristianini and John Shawe-Taylor (2000), An Introduction to Support Vector Machines and other kernel-based learning methods. Cambridge University Press. ISBN 0-521-78019-5.