Jackknife resampling

In statistics, the jackknife is a resampling technique that is especially useful for bias and variance estimation.

The jackknife pre-dates other common resampling methods such as the bootstrap.[1]

The jackknife technique was developed by Maurice Quenouille (1924–1973) from 1949 and refined in 1956.

John Tukey expanded on the technique in 1958 and proposed the name "jackknife" because, like a physical jack-knife (a compact folding knife), it is a rough-and-ready tool that can improvise a solution for a variety of problems even though specific problems may be more efficiently solved with a purpose-designed tool.[2]

The jackknife is a linear approximation of the bootstrap.[2]

A simple example: mean estimation

The jackknife estimator of a parameter is found by systematically leaving out each observation from a dataset, calculating the parameter estimate over the remaining observations, and then aggregating these calculations.

For example, if the parameter to be estimated is the population mean of a random variable $x$, then for a given set of i.i.d. observations $x_1, \ldots, x_n$ the natural estimator is the sample mean:

$$\bar{x} = \frac{1}{n} \sum_{i=1}^{n} x_i = \frac{1}{n} \sum_{i \in [n]} x_i,$$

where the last sum uses another way to indicate that the index $i$ runs over the set $[n] = \{1, \ldots, n\}$.

Then one proceeds as follows: for each $i \in [n]$, compute the mean $\bar{x}_{(i)}$ of the jackknife subsample consisting of all but the $i$-th data point; this is called the $i$-th jackknife replicate:

$$\bar{x}_{(i)} = \frac{1}{n-1} \sum_{j \in [n],\, j \ne i} x_j, \qquad i = 1, \ldots, n.$$

Finally, to form the jackknife estimator, the $n$ jackknife replicates are averaged:

$$\bar{x}_{\mathrm{jack}} = \frac{1}{n} \sum_{i=1}^{n} \bar{x}_{(i)}.$$

One may ask about the bias and the variance of $\bar{x}_{\mathrm{jack}}$, which one could try to calculate explicitly from its definition as the average of the jackknife replicates.

The bias is a trivial calculation, but the variance of $\bar{x}_{\mathrm{jack}}$ is more involved since the jackknife replicates are not independent.

For the special case of the mean, one can show explicitly that the jackknife estimate equals the usual estimate:

$$\frac{1}{n} \sum_{i=1}^{n} \bar{x}_{(i)} = \bar{x}.$$

This establishes the identity $\bar{x}_{\mathrm{jack}} = \bar{x}$. Taking expectations then gives $\mathrm{E}[\bar{x}_{\mathrm{jack}}] = \mathrm{E}[\bar{x}] = \mathrm{E}[x]$, so $\bar{x}_{\mathrm{jack}}$ is unbiased, and taking variances gives $\mathrm{V}[\bar{x}_{\mathrm{jack}}] = \mathrm{V}[\bar{x}] = \mathrm{V}[x]/n$.
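As a quick numerical check (a minimal sketch of our own, not part of the source; the use of NumPy and all variable names are our choices), the following computes the jackknife replicates of the mean and confirms that their average equals the sample mean:

```python
import numpy as np

rng = np.random.default_rng(0)
x = rng.normal(size=20)  # a sample of n = 20 observations
n = len(x)

# i-th jackknife replicate: the mean with the i-th observation left out
replicates = np.array([np.delete(x, i).mean() for i in range(n)])

x_jack = replicates.mean()  # average of the jackknife replicates

# For the mean, the jackknife estimate equals the usual sample mean
assert np.isclose(x_jack, x.mean())
```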

However, these properties do not generally hold for parameters other than the mean.

This simple example of mean estimation merely illustrates the construction of a jackknife estimator; the real subtleties (and the usefulness) emerge when estimating other parameters, such as moments higher than the mean or other functionals of the distribution.

Note that $\bar{x}_{\mathrm{jack}}$ could be used to construct an empirical estimate of the bias of $\bar{x}$, namely $\widehat{\mathrm{bias}}(\bar{x})_{\mathrm{jack}} = c\,(\bar{x}_{\mathrm{jack}} - \bar{x})$ with some suitable factor $c > 0$. Since $\bar{x}_{\mathrm{jack}} = \bar{x}$, this construction does not add any meaningful knowledge, but it does give the correct estimate of the bias (which is zero).

A jackknife estimate of the variance of $\bar{x}$ can be calculated from the variance of the jackknife replicates $\bar{x}_{(i)}$:[3][4]

$$\widehat{\mathrm{var}}(\bar{x})_{\mathrm{jack}} = \frac{n-1}{n} \sum_{i=1}^{n} \left(\bar{x}_{(i)} - \bar{x}_{\mathrm{jack}}\right)^2 = \frac{1}{n(n-1)} \sum_{i=1}^{n} \left(x_i - \bar{x}\right)^2.$$

The left equality defines the estimator $\widehat{\mathrm{var}}(\bar{x})_{\mathrm{jack}}$ and the right equality is an identity that can be verified directly.
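Both sides of this identity can be evaluated on the same sample (again a minimal sketch of our own; the variable names are ours):

```python
import numpy as np

rng = np.random.default_rng(1)
x = rng.normal(size=30)
n = len(x)

replicates = np.array([np.delete(x, i).mean() for i in range(n)])
x_jack = replicates.mean()

# Left-hand side: jackknife variance estimate built from the replicates
var_jack = (n - 1) / n * np.sum((replicates - x_jack) ** 2)

# Right-hand side: the identity, computed directly from the sample
var_direct = np.sum((x - x.mean()) ** 2) / (n * (n - 1))

assert np.isclose(var_jack, var_direct)
```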

Estimating the bias of an estimator

The jackknife technique can be used to estimate (and correct) the bias of an estimator calculated over the entire sample.

Suppose $\theta$ is the target parameter of interest, which is assumed to be some functional of the distribution of $x$. Based on a finite set of observations $x_1, \ldots, x_n$, assumed to consist of i.i.d. copies of $x$, the estimator $\hat{\theta}$ is constructed:

$$\hat{\theta} = f_n(x_1, \ldots, x_n).$$

The value of $\hat{\theta}$ is sample-dependent, so it will change from one random sample to another. By definition, the bias of $\hat{\theta}$ is

$$\mathrm{bias}(\hat{\theta}) = \mathrm{E}[\hat{\theta}] - \theta.$$

One may wish to calculate several values of $\hat{\theta}$ from several samples and average them to obtain an empirical approximation of $\mathrm{E}[\hat{\theta}]$, but this is impossible when there are no "other samples": the entire set of available observations $x_1, \ldots, x_n$ is used to calculate $\hat{\theta}$. In this kind of situation the jackknife resampling technique may be of help.

We construct the jackknife replicates

$$\hat{\theta}_{(i)} = f_{n-1}(x_1, \ldots, x_{i-1}, x_{i+1}, \ldots, x_n), \qquad i = 1, \ldots, n,$$

where each replicate is a "leave-one-out" estimate based on the jackknife subsample consisting of all but the $i$-th data point. Then we define their average:

$$\hat{\theta}_{\mathrm{jack}} = \frac{1}{n} \sum_{i=1}^{n} \hat{\theta}_{(i)}.$$

The jackknife estimate of the bias of $\hat{\theta}$ is given by

$$\widehat{\mathrm{bias}}(\hat{\theta})_{\mathrm{jack}} = (n-1)\left(\hat{\theta}_{\mathrm{jack}} - \hat{\theta}\right),$$

and the resulting bias-corrected jackknife estimate of $\theta$ is

$$\hat{\theta}^{\,*}_{\mathrm{jack}} = \hat{\theta} - \widehat{\mathrm{bias}}(\hat{\theta})_{\mathrm{jack}} = n\,\hat{\theta} - (n-1)\,\hat{\theta}_{\mathrm{jack}}.$$

This removes the bias exactly when it is of order $n^{-1}$ and reduces it to order $n^{-2}$ otherwise.[2]
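As a concrete sketch (our own example, not from the source; the choice of estimator and all names are ours), we can apply this construction to the plug-in variance estimator $\frac{1}{n}\sum_i (x_i - \bar{x})^2$, whose bias is $-\sigma^2/n$. Here the jackknife correction exactly recovers the familiar unbiased estimator with denominator $n-1$:

```python
import numpy as np

def plugin_var(sample):
    """Biased plug-in variance estimator (divides by n)."""
    return np.mean((sample - sample.mean()) ** 2)

rng = np.random.default_rng(2)
x = rng.normal(size=25)
n = len(x)

theta_hat = plugin_var(x)  # full-sample estimate
replicates = np.array([plugin_var(np.delete(x, i)) for i in range(n)])
theta_jack = replicates.mean()  # average of the leave-one-out replicates

bias_jack = (n - 1) * (theta_jack - theta_hat)  # jackknife bias estimate
theta_corrected = theta_hat - bias_jack  # = n*theta_hat - (n-1)*theta_jack

# The correction recovers the unbiased sample variance (denominator n - 1)
assert np.isclose(theta_corrected, x.var(ddof=1))
```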

Estimating the variance of an estimator

The jackknife technique can also be used to estimate the variance of an estimator calculated over the entire sample.
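Although the section breaks off here, the standard jackknife variance estimator has the same form as in the mean case above, $\widehat{\mathrm{var}}(\hat{\theta})_{\mathrm{jack}} = \frac{n-1}{n} \sum_{i=1}^{n} (\hat{\theta}_{(i)} - \hat{\theta}_{\mathrm{jack}})^2$. A minimal generic sketch (our own code; the helper name and the example estimators are ours):

```python
import numpy as np

def jackknife_variance(x, estimator):
    """Jackknife estimate of the variance of `estimator` on sample `x`."""
    n = len(x)
    replicates = np.array([estimator(np.delete(x, i)) for i in range(n)])
    theta_jack = replicates.mean()
    return (n - 1) / n * np.sum((replicates - theta_jack) ** 2)

rng = np.random.default_rng(3)
x = rng.normal(size=50)

# For the mean, this reproduces the classical estimate s^2 / n ...
assert np.isclose(jackknife_variance(x, np.mean), x.var(ddof=1) / len(x))

# ... and it applies unchanged to other (smooth) estimators:
print(jackknife_variance(x, lambda s: np.log(s.var(ddof=1))))
```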

[Figure: Schematic of jackknife resampling.]