Kriging

Under suitable assumptions on the prior, kriging gives the best linear unbiased prediction (BLUP) at unsampled locations.

The method is widely used in spatial analysis and computer experiments.

The theoretical basis for the method was developed by the French mathematician Georges Matheron in 1960, based on the master's thesis of Danie G. Krige, the pioneering plotter of distance-weighted average gold grades at the Witwatersrand reef complex in South Africa.

Krige sought to estimate the most likely distribution of gold based on samples from a few boreholes.

Though computationally intensive in its basic formulation, kriging can be scaled to larger problems using various approximation methods.

Kriging is closely related to regression analysis. Even so, the two are useful in different frameworks: kriging is made for estimation of a single realization of a random field, while regression models are based on multiple observations of a multivariate data set.

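Kriging estimation may also be seen as a spline in a reproducing kernel Hilbert space, with the reproducing kernel given by the covariance function.[2] The difference from the classical kriging approach lies in the interpretation: while the spline is motivated by a minimum-norm interpolation based on a Hilbert-space structure, kriging is motivated by an expected squared prediction error based on a stochastic model.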

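In the Bayesian view, a Gaussian process prior over functions is combined with a Gaussian likelihood for each observed value. The resulting posterior distribution is also Gaussian, with a mean and covariance that can be computed directly from the observed values, their variance, and the kernel matrix derived from the prior.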
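The posterior computation described above can be sketched in a few lines of NumPy. This is only an illustrative sketch under stated assumptions: a zero-mean prior, one-dimensional locations, a squared exponential kernel, and a small observation-noise variance. The function names `sq_exp_kernel` and `gp_posterior` and all parameter values are hypothetical and chosen for the example.

```python
import numpy as np

def sq_exp_kernel(a, b, length_scale=1.0, variance=1.0):
    """Squared exponential covariance between 1-D location arrays a and b."""
    d = a[:, None] - b[None, :]
    return variance * np.exp(-0.5 * (d / length_scale) ** 2)

def gp_posterior(x_obs, y_obs, x_new, noise_var=1e-6):
    """Posterior mean and covariance at x_new for a zero-mean Gaussian process prior."""
    # Kernel matrix of the observations plus the observation-noise variance
    K = sq_exp_kernel(x_obs, x_obs) + noise_var * np.eye(len(x_obs))
    K_s = sq_exp_kernel(x_obs, x_new)    # prior covariance: observed vs. new locations
    K_ss = sq_exp_kernel(x_new, x_new)   # prior covariance among new locations
    # A Cholesky factorisation avoids forming an explicit inverse
    L = np.linalg.cholesky(K)
    alpha = np.linalg.solve(L.T, np.linalg.solve(L, y_obs))
    v = np.linalg.solve(L, K_s)
    mean = K_s.T @ alpha
    cov = K_ss - v.T @ v
    return mean, cov

# Assumed toy data: four observations of sin(x)
x_obs = np.array([0.0, 1.0, 2.5, 4.0])
y_obs = np.sin(x_obs)
x_new = np.array([1.0, 1.75, 3.0])
mean, cov = gp_posterior(x_obs, y_obs, x_new)
```

With a near-zero noise variance the posterior mean passes almost exactly through the observed values, and the posterior variance (the diagonal of `cov`) shrinks towards zero at observed locations and grows away from them.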

In geostatistical models, sampled data are interpreted as the result of a random process.

The fact that these models incorporate uncertainty in their conceptualization does not mean that the phenomenon (the forest, the aquifer, the mineral deposit) has resulted from a random process. Rather, it provides a methodological basis for the spatial inference of quantities at unobserved locations and for quantifying the uncertainty associated with the estimator.

A stochastic process is, in the context of this model, simply a way to approach the set of data collected from the samples.

The first step in geostatistical modelling is to create a random process that best describes the set of observed data.

A value at location $x_1$ (generic denomination of a set of geographic coordinates) is interpreted as a realization $z(x_1)$ of the random variable $Z(x_1)$.

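Because only one realization of each random variable is available in the set of samples, the statistical parameters of the individual variables cannot, in theory, be determined from the data. The proposed solution in the geostatistical formalism consists of assuming various degrees of stationarity in the random function, in order to make the inference of some statistical values possible.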

The hypothesis of stationarity related to the second moment is defined in the following way: the correlation between two random variables solely depends on the spatial distance between them and is independent of their location.
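As a sketch in symbols, and assuming a notation that is not given in the text above (a random function $Z$, a constant mean $m$, a separation vector $h$, a covariance function $C$ and a variogram $\gamma$), second-order stationarity can be written as

$$E[Z(x)] = m, \qquad \operatorname{Cov}\big(Z(x), Z(x+h)\big) = C(h) \quad \text{for all } x,$$

and, under the same assumption, the variogram is related to the covariance by

$$\gamma(h) = \tfrac{1}{2}\, E\big[(Z(x+h) - Z(x))^2\big] = C(0) - C(h).$$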

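The estimate at an unobserved location $x_0$ is a linear combination of the observed values, $\hat{Z}(x_0) = \sum_{i=1}^{N} w_i(x_0)\, Z(x_i)$. The weights $w_i$ are intended to summarize two extremely important procedures in a spatial inference process: they reflect the structural proximity of the samples to the estimation location $x_0$, and at the same time they counteract the bias that clusters of samples would otherwise cause.

When calculating the weights $w_i$, there are two objectives in the geostatistical formalism: unbiasedness and minimal variance of estimation.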

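Kriging seeks to minimize the mean square value of the following error in estimating $Z(x_0)$, subject to lack of bias:

$$\epsilon(x_0) = \hat{Z}(x_0) - Z(x_0) = \sum_{i=1}^{N} w_i(x_0)\, Z(x_i) - Z(x_0),$$

where $w_i(x_0)$ are the kriging weights and $Z(x_i)$ are the values at the $N$ sampled locations.

The two quality criteria referred to previously can now be expressed in terms of the mean and variance of the new random variable $\epsilon(x_0)$. Lack of bias requires $E[\epsilon(x_0)] = 0$; for a random function with constant mean this is equivalent to $\sum_{i=1}^{N} w_i(x_0) = 1$.

The variance of estimation is

$$\operatorname{Var}\big(\epsilon(x_0)\big) = \sum_{i=1}^{N}\sum_{j=1}^{N} w_i w_j\, c(x_i, x_j) - 2\sum_{i=1}^{N} w_i\, c(x_i, x_0) + c(x_0, x_0),$$

where $c(x, y) = \operatorname{Cov}(Z(x), Z(y))$.

Solving this optimization problem (see Lagrange multipliers) results in the kriging system:

$$\begin{bmatrix} \mathbf{C} & \mathbf{1} \\ \mathbf{1}^{\mathsf{T}} & 0 \end{bmatrix} \begin{bmatrix} \mathbf{w} \\ \mu \end{bmatrix} = \begin{bmatrix} \mathbf{c}_0 \\ 1 \end{bmatrix},$$

where $\mathbf{C}$ has entries $c(x_i, x_j)$, $\mathbf{c}_0$ has entries $c(x_i, x_0)$, and $\mathbf{1}$ is a vector of ones. The additional parameter $\mu$ is a Lagrange multiplier used in the minimization of the kriging error to honor the unbiasedness condition.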
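A minimal numerical sketch of such a kriging system is given below. It assumes one-dimensional locations and an exponential covariance model; the function names `exp_cov` and `ordinary_kriging`, the parameter values, and the sample data are all hypothetical choices made for illustration. The system is written so that the Lagrange multiplier `mu` enters as `C w + mu 1 = c0` together with the constraint that the weights sum to one.

```python
import numpy as np

def exp_cov(h, sill=1.0, range_=2.0):
    """Assumed exponential covariance model as a function of the lag h."""
    return sill * np.exp(-np.abs(h) / range_)

def ordinary_kriging(x_obs, z_obs, x0, cov=exp_cov):
    """Solve the kriging system with a sum-to-one constraint on the weights."""
    n = len(x_obs)
    # Augmented matrix [[C, 1], [1^T, 0]]
    A = np.zeros((n + 1, n + 1))
    A[:n, :n] = cov(x_obs[:, None] - x_obs[None, :])
    A[:n, n] = 1.0
    A[n, :n] = 1.0
    # Right-hand side [c0, 1]: covariances between samples and target, then the constraint
    b = np.append(cov(x_obs - x0), 1.0)
    sol = np.linalg.solve(A, b)
    w, mu = sol[:n], sol[n]
    estimate = w @ z_obs
    # Estimation variance under this sign convention for mu
    variance = cov(0.0) - w @ b[:n] - mu
    return estimate, variance, w, mu

# Assumed toy data
x_obs = np.array([0.0, 1.0, 3.0, 4.5])
z_obs = np.array([1.2, 0.8, 1.9, 2.4])
est, var, w, mu = ordinary_kriging(x_obs, z_obs, x0=2.0)
```

In this sketch the weights sum to one by construction, and predicting at a sampled location returns the sampled value with zero estimation variance.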

Simple kriging [9] assumes that the expectation of the random field is known and relies on a covariance function.

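The practical assumptions for the application of simple kriging are: wide-sense stationarity of the field, an expectation that is zero everywhere ($\mu(x) = 0$), and a known covariance function $c(x, y) = \operatorname{Cov}(Z(x), Z(y))$.

The covariance function is a crucial design choice, since it stipulates the properties of the Gaussian process and thereby the behaviour of the model.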

The covariance function encodes information about, for instance, smoothness and periodicity, which is reflected in the estimate produced.

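A very common covariance function is the squared exponential, which heavily favours smooth function estimates.[10] For this reason, it can produce poor estimates in many real-world applications, especially when the true underlying function contains discontinuities and rapid changes.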
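To illustrate how the choice of covariance function encodes smoothness, the sketch below compares two assumed stationary kernels: a squared exponential and an exponential one. For a stationary process with covariance $k$, the expected squared increment over a lag $h$ is $2\,(k(0) - k(h))$, so a kernel that is flat near zero lag implies very small local changes, whereas one that drops linearly allows rough behaviour. The function names and length scales are hypothetical.

```python
import numpy as np

def se_kernel(h, length_scale=1.0):
    """Squared exponential covariance: flat near h = 0, so very smooth fields."""
    return np.exp(-0.5 * (h / length_scale) ** 2)

def exp_kernel(h, length_scale=1.0):
    """Exponential covariance: drops linearly near h = 0, so rough fields."""
    return np.exp(-np.abs(h) / length_scale)

def expected_sq_increment(kernel, h):
    """E[(Z(x + h) - Z(x))^2] for a stationary process with the given covariance."""
    return 2.0 * (kernel(0.0) - kernel(h))

h = 0.05  # a small lag
smooth = expected_sq_increment(se_kernel, h)
rough = expected_sq_increment(exp_kernel, h)
print(f"squared exponential: {smooth:.5f}, exponential: {rough:.5f}")
```

At a small lag the squared exponential kernel implies a far smaller expected change than the exponential kernel, which is one way to see why it struggles with discontinuities and rapid changes.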

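The interpolation by simple kriging is given by

$$\hat{Z}(x_0) = \mathbf{z}^{\mathsf{T}}\, \mathbf{C}^{-1}\, \mathbf{c}_0,$$

where $\mathbf{z} = (z_1, \ldots, z_n)^{\mathsf{T}}$ holds the observed values, $\mathbf{C}$ is the $n \times n$ matrix with entries $c(x_i, x_j)$, and $\mathbf{c}_0$ is the vector with entries $c(x_i, x_0)$.

The kriging error is given by

$$\operatorname{Var}\big(\hat{Z}(x_0) - Z(x_0)\big) = c(x_0, x_0) - \mathbf{c}_0^{\mathsf{T}}\, \mathbf{C}^{-1}\, \mathbf{c}_0,$$

which leads to the generalised least-squares version of the Gauss–Markov theorem (Chiles & Delfiner 1999, p. 159):

$$\operatorname{Var}\big(Z(x_0)\big) = \operatorname{Var}\big(\hat{Z}(x_0)\big) + \operatorname{Var}\big(\hat{Z}(x_0) - Z(x_0)\big).$$

Although kriging was developed originally for applications in geostatistics, it is a general method of statistical interpolation and can be applied within any discipline to sampled data from random fields that satisfy the appropriate mathematical assumptions.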

To date, kriging has been used in a variety of disciplines.

Another very important and rapidly growing field of application, in engineering, is the interpolation of response variables produced by deterministic computer simulations,[28] e.g. finite element method (FEM) simulations.

In this case, kriging is used as a metamodeling tool, i.e. a black-box model built over a designed set of computer experiments.

In many practical engineering problems, such as the design of a metal forming process, a single FEM simulation may take several hours or even a few days.

Kriging is therefore used very often as a so-called surrogate model, implemented inside optimization routines.

[29] Kriging-based surrogate models may also be used in the case of mixed integer inputs.
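The surrogate-model workflow described above can be sketched as follows. This is a simplified illustration, not a production optimization routine: `expensive_simulation` is a hypothetical cheap stand-in for a costly simulation run, the kernel and its length scale are assumed rather than fitted, the mean is a plug-in constant, and the surrogate is minimized by a plain grid search.

```python
import numpy as np

def expensive_simulation(x):
    """Hypothetical stand-in for a costly simulation (e.g. one FEM run)."""
    return (x - 2.0) ** 2 + np.sin(3.0 * x)

def kernel(a, b, length_scale=1.0):
    """Assumed squared exponential covariance between 1-D location arrays."""
    d = a[:, None] - b[None, :]
    return np.exp(-0.5 * (d / length_scale) ** 2)

def fit_surrogate(x_train, y_train, jitter=1e-8):
    """Return a predictor that interpolates the training runs (constant plug-in mean)."""
    mean = y_train.mean()
    K = kernel(x_train, x_train) + jitter * np.eye(len(x_train))
    alpha = np.linalg.solve(K, y_train - mean)

    def predict(x):
        return mean + kernel(x, x_train) @ alpha

    return predict

# A small designed set of "computer experiments"
x_train = np.linspace(0.0, 4.0, 6)
y_train = expensive_simulation(x_train)

# Fit once, then evaluate the cheap surrogate as often as needed
surrogate = fit_surrogate(x_train, y_train)
grid = np.linspace(0.0, 4.0, 401)
x_best = grid[np.argmin(surrogate(grid))]
print("surrogate minimiser:", x_best)
```

Only six calls to the expensive function are made; every other evaluation inside the search goes to the surrogate. In practice the candidate found this way would typically be checked with a further simulation run.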

Figure: Example of one-dimensional data interpolation by kriging, with credible intervals. Squares indicate the location of the data. The kriging interpolation, shown in red, runs along the means of the normally distributed credible intervals shown in gray. The dashed curve shows a spline that is smooth, but departs significantly from the expected values given by those means.

Figure: Simple kriging can be seen as the mean and envelope of Brownian random walks passing through the data points.