In genetics and bioinformatics, a single-nucleotide polymorphism (SNP /snɪp/; plural SNPs /snɪps/) is a germline substitution of a single nucleotide at a specific position in the genome.
Although certain definitions require the substitution to be present in a sufficiently large fraction of the population (e.g. 1% or more),[1] many publications[2][3][4] do not apply such a frequency threshold.
For example, a common SNP in the CFH gene is associated with increased risk of age-related macular degeneration.
[8] "Variant" may also be used as a general term for any single nucleotide change in a DNA sequence,[9] encompassing both common SNPs and rare mutations, whether germline or somatic.
However, this pattern of variation is relatively rare; in a global sample of 67.3 million SNPs, the Human Genome Diversity Project "found no such private variants that are fixed in a given continent or major region."
[38] Genome-wide genetic data can be generated by multiple technologies, including SNP arrays and whole-genome sequencing.
Since GWAS is a genome-wide assessment, a large sample size is required to obtain sufficient statistical power to detect all possible associations.
To estimate study power, the genetic model for disease needs to be considered, such as dominant, recessive, or additive effects.
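As an illustration of how these genetic models differ, the following minimal Python sketch shows one common way to encode a genotype at a biallelic SNP as a numeric predictor under each model. The function name `encode_genotype` and the convention of counting copies of a designated risk allele (0, 1, or 2) are assumptions made for this example and are not taken from any particular GWAS tool.

```python
def encode_genotype(risk_allele_count: int, model: str) -> int:
    """Encode a genotype (0, 1, or 2 copies of the risk allele) under a genetic model.

    Hypothetical helper for illustration only.
    """
    if risk_allele_count not in (0, 1, 2):
        raise ValueError("risk_allele_count must be 0, 1, or 2")
    if model == "additive":
        # Each additional copy of the risk allele adds one unit of effect.
        return risk_allele_count
    if model == "dominant":
        # One copy is enough to show the full effect.
        return 1 if risk_allele_count >= 1 else 0
    if model == "recessive":
        # Two copies are needed to show the effect.
        return 1 if risk_allele_count == 2 else 0
    raise ValueError(f"unknown model: {model}")


if __name__ == "__main__":
    genotypes = [0, 1, 2, 1, 0]  # made-up example data
    for model in ("additive", "dominant", "recessive"):
        print(model, [encode_genotype(g, model) for g in genotypes])
```

Under this encoding, heterozygotes are grouped with risk-allele homozygotes in the dominant model and with non-carriers in the recessive model, which is one reason the assumed model affects the power of a study.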
[41][42] Moreover, cosmopolitan studies in European and South Asian populations have revealed the influence of SNPs on the methylation of specific CpG sites.
[43] In addition, meQTL enrichment analysis using a GWAS database demonstrated that these associations are important for the prediction of biological traits.
However, in instances with degraded or small-volume samples, SNP techniques are an excellent alternative to STR methods.
Compared with STRs, SNPs offer an abundance of potential markers, can be typed with fully automated methods, and may reduce the required fragment length to less than 100 bp.
[26] Pharmacogenetics focuses on identifying genetic variations, including SNPs, that are associated with differential responses to treatment.
These findings have significantly improved understanding of disease pathogenesis and molecular pathways, and have facilitated the development of better treatments.
Some include: DNA sequencing; capillary electrophoresis; mass spectrometry; single-strand conformation polymorphism (SSCP); single base extension; electrochemical analysis; denaturing HPLC and gel electrophoresis; restriction fragment length polymorphism; and hybridization analysis.
An important group of SNPs are those that correspond to missense mutations, which cause an amino acid change at the protein level.
Usually, a substitution between amino acids of similar size and physicochemical properties (e.g. leucine to valine) has a mild effect, whereas a substitution between dissimilar amino acids tends to have a more severe effect.
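The similarity rule can be sketched in a few lines of Python. This is only a rough illustration: the grouping of amino acids into five classes is a coarse textbook-style simplification chosen for this example, it does not model residue size, and the names `AA_CLASSES` and `classify_substitution` are hypothetical. Real prediction programs combine many more features, such as sequence conservation and protein structure.

```python
# Coarse physicochemical classes of the 20 standard amino acids
# (one-letter codes). This grouping is an assumption for illustration.
AA_CLASSES = {
    "nonpolar_aliphatic": "GAVLIMP",
    "aromatic": "FWY",
    "polar_uncharged": "STCNQ",
    "positively_charged": "KRH",
    "negatively_charged": "DE",
}

# Reverse lookup: amino acid letter -> class name.
AA_TO_CLASS = {
    aa: name for name, members in AA_CLASSES.items() for aa in members
}


def classify_substitution(ref: str, alt: str) -> str:
    """Label a missense change as 'conservative' or 'radical'.

    Returns 'synonymous' if the amino acid is unchanged.
    """
    ref, alt = ref.upper(), alt.upper()
    if ref not in AA_TO_CLASS or alt not in AA_TO_CLASS:
        raise ValueError("expected one-letter codes of standard amino acids")
    if ref == alt:
        return "synonymous"
    if AA_TO_CLASS[ref] == AA_TO_CLASS[alt]:
        return "conservative"  # same class: milder effect expected
    return "radical"  # different class: larger effect expected


if __name__ == "__main__":
    print(classify_substitution("L", "V"))  # leucine to valine
    print(classify_substitution("E", "V"))  # glutamate to valine
```

With this grouping, leucine to valine is labelled conservative because both residues are nonpolar and aliphatic, while glutamate to valine is labelled radical because a negatively charged residue is replaced by a nonpolar one.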
Using these simple rules and many others derived by machine learning, a group of programs for predicting SNP effects has been developed:[66]