[1] There are an estimated 15 million SNP (single-nucleotide polymorphism) sites (out of roughly 3 billion base pairs, or about 0.5%) from which AIMs may potentially be selected.
Statistical methods such as the apparent error rate and the Improved Bayesian Estimate can be used to find the set of SNPs that predicts a specific ancestry with the highest accuracy.
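As a rough illustration of ranking candidate markers by apparent error rate, the following sketch scores each SNP by the resubstitution error of a simple majority-vote rule on a labeled reference panel. It is a simplified, per-SNP toy example under stated assumptions, not the procedure used in any particular study: the function names, the genotype coding (0, 1, or 2 copies of an allele), the population labels, and the data are all hypothetical.

```python
from collections import Counter, defaultdict


def apparent_error_rate(genotypes, labels):
    """Resubstitution error of a majority-vote rule for a single SNP.

    Each observed genotype is assigned to the population in which it is
    most common; the apparent error rate is the fraction of the same
    (training) individuals that this rule misclassifies.
    """
    votes = defaultdict(Counter)
    for genotype, population in zip(genotypes, labels):
        votes[genotype][population] += 1
    # For each genotype, everyone outside the majority population is an error.
    errors = sum(sum(c.values()) - max(c.values()) for c in votes.values())
    return errors / len(labels)


def rank_snps(genotype_matrix, labels, snp_ids):
    """Return (snp_id, error_rate) pairs, lowest apparent error rate first."""
    rates = {
        snp: apparent_error_rate([row[j] for row in genotype_matrix], labels)
        for j, snp in enumerate(snp_ids)
    }
    return sorted(rates.items(), key=lambda item: item[1])


# Hypothetical toy panel: rows are individuals, columns are SNPs.
labels = ["A", "A", "A", "B", "B", "B"]
genotype_matrix = [
    [0, 1],
    [0, 2],
    [1, 0],
    [2, 1],
    [2, 2],
    [2, 0],
]
snp_ids = ["snp_1", "snp_2"]

ranking = rank_snps(genotype_matrix, labels, snp_ids)
for snp, rate in ranking:
    print(f"{snp}: apparent error rate = {rate:.2f}")
```

In this toy panel, snp_1 separates the two groups perfectly while snp_2 carries no information, so snp_1 ranks first. Because the apparent error rate is measured on the same individuals used to build the rule, it tends to be optimistic, which is one reason it is usually paired with other estimators.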
A set of aAIMs can be used to identify the ancestry of ancient populations and eventually quantify the genetic similarity to modern-day individuals.
[11] An array of private companies, such as 23andMe and AncestryDNA, provide cost-effective direct-to-consumer (DTC) genetic testing, analyzing ancestry informative markers to determine geographic origins.
These private companies collect massive quantities of data, such as biological samples and self-reported information from consumers, a practice known as biobanking, which enables their researchers to gain further insight into AIMs.
These types of arrays can help reduce the cost of identifying risk factors, since they allow researchers to screen for ancestry markers instead of the entire genome.
However, the study by Yang et al. (2005) suggests that the technology needed to identify ancestry-associated variations in human disease, and to research them in greater depth, already exists.