Copy number variation

One of the best-known examples of a short copy number variation is the CAG trinucleotide repeat in the huntingtin gene, which is responsible for the neurological disorder Huntington's disease.

[6] The number of repeats of the CAG trinucleotide is inversely correlated with the age of onset of Huntington's disease.
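To make the notion of a repeat count concrete, the following minimal Python sketch counts the longest uninterrupted run of CAG triplets in a DNA string. The function name `longest_cag_run` and the example sequence are hypothetical and chosen only for illustration; real genotyping works on sequencing or fragment-length data rather than a clean string.

```python
import re


def longest_cag_run(sequence: str) -> int:
    """Return the largest number of consecutive CAG triplets in a DNA string."""
    seq = sequence.upper()
    # Each match is a maximal stretch of back-to-back CAG triplets.
    runs = re.findall(r"(?:CAG)+", seq)
    # Every triplet is 3 characters long; an empty list means no CAG at all.
    return max((len(run) // 3 for run in runs), default=0)


# Made-up sequence: five tandem CAG triplets followed by one isolated CAG.
example = "ATG" + "CAG" * 5 + "CCGCAGTT"
print(longest_cag_run(example))
```

Only the longest tandem run is reported, since isolated CAG triplets elsewhere in the sequence do not form part of the repeat tract.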

[1] Lastly, copy number variations do not appear to be spatially biased toward particular regions of the genome.

[1] Although fluorescent in situ hybridization and microsatellite analysis originally indicated that copy number repeats are localized to highly repetitive regions such as telomeres, centromeres, and heterochromatin,[11] recent genome-wide studies have concluded otherwise.

[2] On the basis of cytogenetic observations, copy number variation was initially thought to occupy an extremely small and negligible portion of the genome.

[10] Initially these advances involved using bacterial artificial chromosome (BAC) arrays with intervals of around 1 megabase throughout the entire genome.[14] BACs can also detect copy number variations in rearrangement hotspots, which allowed for the detection of 119 novel copy number variations.

[10] Another way of detecting copy number variation is to use single nucleotide polymorphisms (SNPs).

[10] Owing to the abundance of human SNP data, the detection of copy number variation has shifted toward utilizing these SNPs.

One of the best-recognized mechanisms that leads to copy number variations, as well as deletions and inversions, is non-allelic homologous recombination.

[19] During meiotic recombination, homologous chromosomes pair up and form two-ended double-stranded breaks, leading to Holliday junctions.

However, in the aberrant mechanism, during the formation of Holliday junctions, the double-stranded breaks are misaligned and the crossover lands in non-allelic positions on the same chromosome.

When the Holliday junction is resolved, the unequal crossing over event allows transfer of genetic material between the two homologous chromosomes, and as a result, a portion of the DNA on both the homologues is repeated.
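The effect of a misaligned crossover on copy number can be illustrated with a toy model. The Python sketch below represents each homologue as a list of segment labels and exchanges the segments beyond the crossover point. The labels (`REP` for a repeated sequence flanking `GeneX`), the function name `unequal_crossover`, and the simplification that the two products are a duplication and a reciprocal deletion are assumptions made for illustration, not details taken from the text.

```python
def unequal_crossover(chrom_a, chrom_b, repeat_idx_a, repeat_idx_b):
    """Recombine two homologues at the given repeat positions.

    chrom_a and chrom_b are lists of segment labels read left to right.
    The crossover is placed inside the repeat at repeat_idx_a on chrom_a
    and the repeat at repeat_idx_b on chrom_b; everything to the right of
    those positions is swapped between the two chromosomes.
    """
    if chrom_a[repeat_idx_a] != chrom_b[repeat_idx_b]:
        raise ValueError("crossover requires matching repeat segments")
    product_1 = chrom_a[: repeat_idx_a + 1] + chrom_b[repeat_idx_b + 1 :]
    product_2 = chrom_b[: repeat_idx_b + 1] + chrom_a[repeat_idx_a + 1 :]
    return product_1, product_2


# Hypothetical layout: Gene X flanked by two copies of the same repeat.
homologue = ["A", "REP", "GeneX", "REP", "B"]

# Non-allelic pairing: the distal repeat of one homologue (index 3)
# aligns with the proximal repeat of the other (index 1).
gained, lost = unequal_crossover(homologue, list(homologue), 3, 1)
print(gained.count("GeneX"), gained)
print(lost.count("GeneX"), lost)
```

If both indices point to the same repeat (for example 3 and 3), the pairing is allelic and both products are identical to the input, which is the normal outcome of recombination in this model.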

Another type of homologous recombination-based mechanism that can lead to copy number variation is known as break-induced replication.

[20] Errors in repairing the break, similar to non-allelic homologous recombination, can lead to an increase in copy number of a particular region of the genome.

[20] As in the non-allelic homologous recombination mechanism, an extra copy of a particular region is transferred to another chromosome, leading to a duplication event.

Another mechanism is the break-fusion-bridge cycle, which involves sister chromatids that have both lost their telomeric regions due to double-stranded breaks.

[23] It is proposed that these sister chromatids will fuse together to form one dicentric chromosome, and then segregate into two different nuclei.

[24] Note that although this has been experimentally observed and is a widely accepted mechanism, the molecular interactions that lead to this error remain unknown.
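The fusion of sister chromatids into a dicentric chromosome can also be sketched as a toy model. In the Python example below, a chromatid is a list of segment labels running from the centromere (`CEN`) to the broken end. The function names, the segment labels, and in particular the breakage step are assumptions for illustration: the breakage of the dicentric chromosome at an arbitrary point between the two centromeres follows the usual textbook description of the cycle and is not stated in the text above.

```python
def fuse_sister_chromatids(chromatid):
    """Join two identical chromatids at their broken (telomere-less) ends."""
    # The second copy is reversed so the two broken ends meet in the middle.
    return chromatid + chromatid[::-1]


def break_bridge(dicentric, break_after):
    """Split the dicentric chromosome after the 0-based index break_after."""
    return dicentric[: break_after + 1], dicentric[break_after + 1 :]


# Hypothetical chromatid whose telomere was lost beyond segment C.
chromatid = ["CEN", "A", "B", "C"]
dicentric = fuse_sister_chromatids(chromatid)

# Assumed asymmetric break between the two centromeres.
daughter_1, daughter_2 = break_bridge(dicentric, 5)
print(dicentric)
print(daughter_1)  # carries an extra, inverted copy of segments C and B
print(daughter_2)  # has lost segments B and C
```

In this sketch an off-centre break leaves one product with a gain in copy number and the other with a loss; a break exactly in the middle would simply regenerate the two original chromatids.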

[9] AMY1 is one of the most well-studied genes, with a wide range of copy numbers across different human populations.

[9] As a result, it was hypothesized that the copy number of the AMY1 gene is closely correlated with its protein function, which is to digest starch.

[9] It was hypothesized that the level of starch, the substrate for AMY1, in one’s regular diet can directly affect the copy number of the AMY1 gene.

[9] This implies that natural selection played a considerable role in shaping the average number of AMY1 genes in these two populations.

[9] However, as only six populations were studied, it is important to consider the possibility that factors in their diet or culture other than starch may have influenced the AMY1 copy number.

[9] It can be inferred from the results that the increase in bonobo AMY1 copy number is likely not correlated with the amount of starch in their diet.

[9] This hypothesis, although logical, lacks experimental evidence because of the difficulty of gathering information on shifts in human diets, especially regarding root vegetables that are high in starch, as these shifts cannot be directly observed or tested.

[28] Genomic duplication and triplication of the gene appear to be a rare cause of Parkinson's disease, although more common than point mutations.

This gene duplication has created a copy number variation. The chromosome now has two copies of this section of DNA, rather than one.
Diagrammatic representation of non-allelic homologous recombination. Here, Gene X represents the gene of interest and the black line represents the chromosome. When the two homologous chromosomes are misaligned and recombination occurs, it may result in a duplication of the gene.
Timeline of the change in hominin diet throughout late Paleolithic, Mesolithic, and Neolithic periods. As seen, root vegetables rich in starch were consumed around 20,000 years ago when the AMY1 diploid gene number is estimated to have increased.
Simplified phylogenetic tree of the great ape lineage and the number of diploid AMY1 genes that each species has. The AMY1 gene number is shown to increase after the split with the chimpanzee lineage.
Possible mechanism of how multiple copies of a gene can lead to a protein family over years with natural selection. Here, Gene X is the gene of interest that is duplicated and Gene X1 and Gene X2 are genes that acquired mutations and became functionally different to Gene X.