HapMap is used to find genetic variants affecting health, disease and responses to drugs and environmental factors.
The International HapMap Project is a collaboration among researchers at academic centers, non-profit biomedical research groups and private companies in Canada, China (including Hong Kong), Japan, Nigeria, the United Kingdom, and the United States.
[2] The Phase III dataset was released in spring 2009, and the publication presenting the final results appeared in September 2010.
Although any two unrelated people share about 99.5% of their DNA sequence, their genomes differ at specific nucleotide locations.
Such sites are known as single nucleotide polymorphisms (SNPs), and each of the possible resulting gene forms is called an allele.
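As a rough illustration of the definition above, the following sketch compares a handful of aligned sequences position by position and reports the sites where more than one nucleotide is observed, together with the count of each allele. The function name `find_snp_sites` and the toy sequences are assumptions made for this example; they are not part of the HapMap data or tooling.

```python
from collections import Counter


def find_snp_sites(sequences):
    """Return {position: Counter of alleles} for every position at which
    the aligned sequences do not all carry the same nucleotide."""
    length = len(sequences[0])
    if any(len(seq) != length for seq in sequences):
        raise ValueError("sequences must be aligned to the same length")

    sites = {}
    for pos in range(length):
        # Count how often each nucleotide (allele) occurs at this position.
        alleles = Counter(seq[pos] for seq in sequences)
        if len(alleles) > 1:  # more than one allele observed: a variable site
            sites[pos] = alleles
    return sites


# Toy aligned sequences, invented for illustration (not HapMap data).
toy_sequences = [
    "ACGTTAGC",
    "ACGTTAGC",
    "ACGCTAGC",
    "ACGTTAGA",
]

snp_sites = find_snp_sites(toy_sequences)
for position, alleles in sorted(snp_sites.items()):
    print(position, dict(alleles))
```

Here the sequences differ only at the zero-based positions 3 and 7; every other position is identical across all four sequences.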
Four populations were selected for inclusion in the HapMap: 30 adult-and-both-parents Yoruba trios from Ibadan, Nigeria (YRI), 30 trios of Utah residents of northern and western European ancestry (CEU), 44 unrelated Japanese individuals from Tokyo, Japan (JPT) and 45 unrelated Han Chinese individuals from Beijing, China (CHB).
[9] In Phase III, 11 global ancestry groups were assembled: ASW (African ancestry in Southwest USA); CEU (Utah residents with Northern and Western European ancestry from the CEPH collection); CHB (Han Chinese in Beijing, China); CHD (Chinese in Metropolitan Denver, Colorado); GIH (Gujarati Indians in Houston, Texas); JPT (Japanese in Tokyo, Japan); LWK (Luhya in Webuye, Kenya); MEX (Mexican ancestry in Los Angeles, California); MKK (Maasai in Kinyawa, Kenya); TSI (Tuscans in Italy); YRI (Yoruba in Ibadan, Nigeria).
So the National Institutes of Health embraced the idea of a "shortcut": looking only at sites in the genome where many people carry a variant DNA unit.
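One way to picture this "shortcut" is as a filter that keeps only the sites whose less common allele is still carried by a sizeable share of people. The sketch below computes a minor allele frequency for each site and keeps those above a cutoff. The helper names, the toy allele counts and the 5% cutoff are assumptions chosen for illustration, not values taken from the text above.

```python
def minor_allele_frequency(allele_counts):
    """Frequency of the second most common allele at a site.

    Returns 0.0 when only one allele (or none) was observed.
    """
    total = sum(allele_counts.values())
    if total == 0 or len(allele_counts) < 2:
        return 0.0
    ordered = sorted(allele_counts.values(), reverse=True)
    return ordered[1] / total


def common_variant_sites(site_counts, min_maf=0.05):
    """Keep sites whose minor allele frequency is at least min_maf.

    min_maf=0.05 is an illustrative assumption, not a documented setting.
    """
    return [
        site
        for site, counts in site_counts.items()
        if minor_allele_frequency(counts) >= min_maf
    ]


# Hypothetical allele counts per site, invented for this example.
toy_counts = {
    "site_a": {"A": 60, "G": 40},  # variant carried by many: kept
    "site_b": {"C": 99, "T": 1},   # rare variant: dropped
    "site_c": {"G": 100},          # no variation: dropped
}

common = common_variant_sites(toy_counts)
print(common)
```

With these toy counts only `site_a` passes the filter, since its minor allele accounts for 40% of observations, while the others fall below the assumed cutoff.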
The Canadian team was led by Thomas J. Hudson at McGill University in Montreal and focused on chromosomes 2 and 4p.
The Chinese team was led by Huanming Yang in Beijing and Shanghai and by Lap-Chee Tsui in Hong Kong, and focused on chromosomes 3, 8p and 21.
During Phase II, more than two million additional SNPs were genotyped throughout the genome by David R. Cox, Kelly A. Frazer and others at Perlegen Sciences, and a further 500,000 by the company Affymetrix.
All of the data generated by the project, including SNP frequencies, genotypes and haplotypes, were placed in the public domain and are available for download.