[1] It is achieved by using known haplotypes in a population, for instance from the HapMap or the 1000 Genomes Project in humans. This makes it possible to test for association between a trait of interest (e.g. a disease) and genetic variants that were not typed experimentally but whose genotypes have been statistically inferred ("imputed").
[8] Additional phasing tools such as SHAPEIT2[9] allow the input genotypes to be prephased into haplotypes, which improves imputation accuracy and computational performance.
As of mid-2014, whole-genome sequence data is publicly available from the 1000 Genomes Project website[11] for 2535 individuals from 26 different populations around the world.
Designing accurate statistical models for genotype imputation is closely related to the problem of haplotype estimation ("phasing") and is an active area of research.[1][3]
As of 2022, all modern phasing and imputation software is based on the Li & Stephens hidden Markov model construct.
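The hidden Markov model construct referred to above can be illustrated with a minimal sketch. It treats a target haplotype as an imperfect mosaic of reference haplotypes: the hidden state at each site is the reference haplotype being copied, and a forward-backward pass gives the posterior probability of each allele at untyped sites. The function name `impute_haplotype` and the parameters `switch_prob` and `error_prob` are illustrative assumptions, not taken from any particular software package, and the constant per-site switch probability is a simplification of what production tools do.

```python
def impute_haplotype(ref_haps, target, switch_prob=0.05, error_prob=0.01):
    """Return the posterior probability of allele 1 at every site.

    ref_haps: list of reference haplotypes, each a list of 0/1 alleles.
    target:   list of 0/1 alleles, with None at untyped sites.
    switch_prob and error_prob are hypothetical, fixed illustration values.
    """
    n_ref, n_sites = len(ref_haps), len(ref_haps[0])

    def emission(k, m):
        # Untyped sites carry no information about the copied haplotype.
        if target[m] is None:
            return 1.0
        return 1.0 - error_prob if ref_haps[k][m] == target[m] else error_prob

    def normalise(values):
        total = sum(values)
        return [v / total for v in values]

    # Forward pass; each column is rescaled to sum to 1 for numerical stability.
    fwd = [normalise([emission(k, 0) / n_ref for k in range(n_ref)])]
    for m in range(1, n_sites):
        prev = fwd[-1]
        fwd.append(normalise([
            emission(k, m) * ((1.0 - switch_prob) * prev[k] + switch_prob / n_ref)
            for k in range(n_ref)
        ]))

    # Backward pass, rescaled in the same way.
    bwd = [[1.0] * n_ref for _ in range(n_sites)]
    for m in range(n_sites - 2, -1, -1):
        weighted = [emission(k, m + 1) * bwd[m + 1][k] for k in range(n_ref)]
        jump = switch_prob * sum(weighted) / n_ref
        bwd[m] = normalise([(1.0 - switch_prob) * w + jump for w in weighted])

    # Combine both passes: posterior over copied haplotypes, then over alleles.
    dosages = []
    for m in range(n_sites):
        post = normalise([f * b for f, b in zip(fwd[m], bwd[m])])
        dosages.append(sum(p * ref_haps[k][m] for k, p in enumerate(post)))
    return dosages


# Toy reference panel of three haplotypes over four sites (made-up data).
panel = [[0, 0, 0, 0], [1, 1, 1, 1], [0, 1, 0, 1]]
# The second site is untyped in the target and is imputed from the panel.
print(impute_haplotype(panel, [1, None, 1, 1]))
```

In this toy example the target matches the second reference haplotype at every typed site, so the imputed probability of allele 1 at the untyped site is close to 1. The sketch handles a single haploid sequence only; diploid genotypes, recombination maps and large panels need considerably more machinery.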