One-shot learning is an object categorization problem, found mostly in computer vision.
The ability to learn object categories from few examples, and at a rapid pace, has been demonstrated in humans.[1][2]
It is estimated that a child learns almost all of the 10,000 to 30,000 object categories in the world by age six.
Given two examples from two object categories, one an unknown object composed of familiar shapes and the other an unknown, amorphous shape, it is much easier for humans to recognize the former than the latter. This suggests that humans make use of previously learned categories when learning new ones.
The Bayesian one-shot learning algorithm represents the foreground and background of images as parametrized by a mixture of constellation models.
For object recognition on new images, the posterior obtained during the learning phase is used in a Bayesian decision framework to estimate the ratio of p(object | test, train) to p(background clutter | test, train); a ratio exceeding a threshold indicates the presence of the object.
To compute these probabilities, the object class must be modeled from a small set (one to five) of training images containing examples.
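As an illustration, the decision rule reduces to comparing log-likelihood ratios; the function below is a sketch with illustrative names and a hypothetical threshold, not the authors' implementation:

```python
import math

def bayes_decision(log_lik_fg, log_lik_bg, log_prior_fg=math.log(0.5),
                   log_prior_bg=math.log(0.5), threshold=0.0):
    """Decide object vs. background clutter from log-likelihoods.

    log_lik_fg stands in for log p(I | I_t, O_fg) and log_lik_bg for
    log p(I | I_t, O_bg), assumed to come from the learned category and
    background models. Returns True when the log-ratio exceeds the threshold.
    """
    log_ratio = (log_lik_fg + log_prior_fg) - (log_lik_bg + log_prior_bg)
    return log_ratio > threshold

# Example: the foreground model explains the test image far better.
print(bayes_decision(-120.0, -135.0))  # True
```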
We next introduce parametric models for the foreground and background categories, with parameters θ and θ_bg respectively. Integrating the image likelihood over these model parameters yields p(I | I_t, O_fg) = ∫ p(I | θ, O_fg) p(θ | I_t, O_fg) dθ. The posterior distribution of model parameters given the training images, p(θ | I_t, O_fg), is estimated in the learning phase.
To detect features in an image, a set of N interesting regions is first detected using the Kadir–Brady saliency detector; each region yields a location and an appearance descriptor for a candidate feature.
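A toy sketch of entropy-based region scoring in the spirit of the Kadir–Brady detector follows; the real detector also performs scale selection and weights entropy by self-dissimilarity, neither of which this simplified version does:

```python
import numpy as np

def local_entropy_saliency(img, win=8, n_regions=5):
    """Score each non-overlapping window by the Shannon entropy of its
    intensity histogram and return the top n_regions window corners.
    A simplified, illustrative stand-in for Kadir-Brady saliency."""
    h, w = img.shape
    scored = []
    for y in range(0, h - win + 1, win):
        for x in range(0, w - win + 1, win):
            patch = img[y:y + win, x:x + win]
            hist, _ = np.histogram(patch, bins=16, range=(0, 256))
            p = hist / hist.sum()
            p = p[p > 0]
            entropy = -(p * np.log2(p)).sum()
            scored.append((entropy, (y, x)))
    scored.sort(key=lambda t: -t[0])
    return [pos for _, pos in scored[:n_regions]]

# A textured patch scores higher than the flat background.
rng = np.random.default_rng(0)
img = np.zeros((32, 32))
img[8:16, 8:16] = rng.integers(0, 256, (8, 8))  # textured region
print(local_entropy_saliency(img, n_regions=1))  # [(8, 8)]
```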
Assuming that shape (i.e., the collection of part locations) and appearance are independent allows one to consider the likelihood as a product of a shape term and an appearance term.
Each part in the constellation model has a Gaussian density within this appearance space, with its own mean and precision parameters.
From these, the appearance likelihood described above is computed as a product of Gaussians over the model parts for a given hypothesis h and mixture component ω.
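A minimal sketch of this product-of-Gaussians appearance term, assuming diagonal precisions and illustrative variable names (not the paper's notation):

```python
import numpy as np

def appearance_log_likelihood(features, assignment, means, precisions):
    """Appearance term for one hypothesis: a product of Gaussians over parts,
    computed as a sum of Gaussian log-densities. assignment[p] gives the index
    of the detected feature assigned to part p; means/precisions hold each
    part's Gaussian parameters (diagonal precisions, for simplicity)."""
    total = 0.0
    for p, feat_idx in enumerate(assignment):
        x = features[feat_idx]
        mu, prec = means[p], precisions[p]
        d = x.shape[0]
        # log N(x; mu, prec^-1) with diagonal precision
        total += 0.5 * (np.log(prec).sum() - d * np.log(2 * np.pi)
                        - (prec * (x - mu) ** 2).sum())
    return total

# One part, feature exactly at the part mean: log N(0; 0, 1) = -0.5*log(2*pi)
feats = np.array([[0.0], [1.0]])
means = np.array([[0.0]])
precs = np.array([[1.0]])
print(round(appearance_log_likelihood(feats, [0], means, precs), 4))  # -0.9189
```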
The model shape for a given mixture component ω and hypothesis h is represented as a joint Gaussian density of the locations of features.
These features are transformed into a scale and translation-invariant space before modelling the relative location of the parts by a 2(P - 1)-dimensional Gaussian.
To reduce the number of hypotheses considered, only those that satisfy the ordering constraint that the x-coordinate of each part is monotonically increasing are retained.
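The ordering constraint can be sketched as a filter over candidate feature-to-part assignments; the brute-force enumeration below is for illustration only:

```python
from itertools import permutations

def ordered_hypotheses(n_features, n_parts, x_coords):
    """Enumerate assignments of detected features to model parts, keeping only
    hypotheses whose part x-coordinates are monotonically increasing (the
    ordering constraint that prunes the hypothesis space)."""
    hyps = []
    for assign in permutations(range(n_features), n_parts):
        xs = [x_coords[i] for i in assign]
        if all(a < b for a, b in zip(xs, xs[1:])):
            hyps.append(assign)
    return hyps

# 4 features, 3 parts: of 24 raw assignments, only 4 are x-increasing.
print(len(ordered_hypotheses(4, 3, [0.0, 1.0, 2.0, 3.0])))  # 4
```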
However, because in one-shot learning few training examples are used, the distribution will not be well peaked, as is assumed in a maximum likelihood (delta function) approximation.
Thus, instead of this traditional approximation, the Bayesian one-shot learning algorithm seeks to "find a parametric form of p(θ) such that the learning of p(θ | X_t, A_t) is feasible", and uses a normal-Wishart distribution as the conjugate prior.
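The effect motivating this choice can be seen with a conjugate normal posterior over a single Gaussian mean, a deliberately simplified model rather than the paper's: with only a couple of samples the posterior stays broad, so a point estimate discards real uncertainty.

```python
import math

def posterior_over_mean(samples, prior_mean=0.0, prior_prec=1.0, noise_prec=1.0):
    """Conjugate normal posterior over a Gaussian mean with known noise
    precision. With few samples the posterior remains broad, which is why
    a full posterior is kept instead of an MLE point estimate."""
    n = len(samples)
    post_prec = prior_prec + n * noise_prec
    post_mean = (prior_prec * prior_mean + noise_prec * sum(samples)) / post_prec
    return post_mean, 1.0 / math.sqrt(post_prec)  # posterior mean, std

_, sd_few = posterior_over_mean([1.0, 1.2])        # 2 "training images"
_, sd_many = posterior_over_mean([1.0, 1.2] * 50)  # 100 samples
print(sd_few > 3 * sd_many)  # True: far broader posterior with few examples
```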
Because p(X, A | θ) is a product of Gaussians, as chosen in the object category model, the integral reduces to a multivariate Student's t distribution, which can be evaluated.[22]
To obtain shape and appearance priors, three categories (spotted cats, faces, and airplanes) are learned using maximum likelihood estimation.
These object category model parameters are then used to estimate the hyper-parameters of the desired priors.
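For reference, the multivariate Student's t density that the recognition integral reduces to can be evaluated directly; the implementation below is generic and not tied to the paper's particular parameterization:

```python
import math
import numpy as np

def mvt_logpdf(x, mu, sigma, nu):
    """Log density of a multivariate Student's t with dof nu, location mu, and
    scale matrix sigma: the form obtained when a Gaussian likelihood is
    integrated against a normal-Wishart posterior."""
    d = len(mu)
    diff = np.asarray(x) - np.asarray(mu)
    inv = np.linalg.inv(sigma)
    _, logdet = np.linalg.slogdet(sigma)
    quad = diff @ inv @ diff
    return (math.lgamma((nu + d) / 2) - math.lgamma(nu / 2)
            - 0.5 * (d * math.log(nu * math.pi) + logdet)
            - 0.5 * (nu + d) * math.log1p(quad / nu))

# 1-D check: with nu=1, mu=0, sigma=1 this is the Cauchy density at 0, 1/pi.
print(round(math.exp(mvt_logpdf([0.0], [0.0], np.eye(1), 1.0)), 4))  # 0.3183
```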
Given a set of training examples, the algorithm runs the feature detector on these images, and determines model parameters from the salient regions.
The hypothesis index h assigning features to parts prevents a closed-form solution of the linear model, so the posterior p(θ | X_t, A_t, O_fg) is estimated by the variational Bayesian expectation–maximization (VBEM) algorithm, which is run until parameter convergence after roughly 100 iterations.
Learning a category in this fashion takes under a minute on a 2.8 GHz machine with a four-part model and fewer than 10 training images.
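The iterate-until-convergence structure can be illustrated with plain EM on a toy one-dimensional mixture; note this is ordinary EM with point estimates, a simplification of the variational Bayesian EM actually used, which updates posterior hyper-parameters instead:

```python
import numpy as np

def em_gmm_1d(x, n_iter=100, tol=1e-6):
    """Plain EM for a two-component 1-D Gaussian mixture: a simplified
    stand-in showing the run-until-parameter-convergence loop."""
    mu = np.array([x.min(), x.max()], dtype=float)  # spread-out init
    var = np.array([x.var(), x.var()]) + 1e-6
    w = np.array([0.5, 0.5])
    for _ in range(n_iter):
        # E-step: responsibilities of each component for each point
        dens = w * np.exp(-0.5 * (x[:, None] - mu) ** 2 / var) / np.sqrt(2 * np.pi * var)
        r = dens / dens.sum(axis=1, keepdims=True)
        # M-step: re-estimate weights, means, variances
        nk = r.sum(axis=0)
        new_mu = (r * x[:, None]).sum(axis=0) / nk
        var = (r * (x[:, None] - new_mu) ** 2).sum(axis=0) / nk + 1e-6
        w = nk / len(x)
        converged = np.abs(new_mu - mu).max() < tol
        mu = new_mu
        if converged:  # "run until parameter convergence"
            break
    return np.sort(mu)

rng = np.random.default_rng(0)
x = np.concatenate([rng.normal(0.0, 0.5, 200), rng.normal(5.0, 0.5, 200)])
mu = em_gmm_1d(x)
print(abs(mu[0]) < 0.3 and abs(mu[1] - 5.0) < 0.3)  # True
```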
The values of a particular pixel p across all of the images define a binary random variable, and congealing aligns the images by minimizing the summed entropies of these pixel-wise variables.[25] To use this model for classification, the transform with maximum posterior probability given an observed image must be estimated.
To perform this estimation, the test image I is inserted into the training ensemble for the congealing process.
Given the transforms obtained from congealing many images of a certain category, the classifier can be extended to the case where only one training example of a new category is available: applying these transforms to the single example generates an artificial training set for the new category.
This artificial data set can be made larger by borrowing transformations from many already known categories.
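A minimal sketch of building such an artificial training set, using simple integer pixel shifts as stand-ins for the affine transforms recovered by congealing:

```python
import numpy as np

def artificial_training_set(example, transforms):
    """Expand a single example into an artificial training set by applying
    transforms borrowed from already-known categories. Each 'transform' here
    is just a (dy, dx) shift; the real method reuses the affine transforms
    found during congealing."""
    out = [example]
    for dy, dx in transforms:
        out.append(np.roll(np.roll(example, dy, axis=0), dx, axis=1))
    return np.stack(out)

one_example = np.zeros((8, 8))
one_example[3, 3] = 1.0
borrowed = [(0, 1), (1, 0), (-1, -1)]  # shifts "borrowed" from other categories
data = artificial_training_set(one_example, borrowed)
print(data.shape)  # (4, 8, 8)
```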