[2] Repeated sequences are categorized into different classes depending on features such as structure, length, location, origin, and mode of multiplication.
Overall, repeated sequences are an important area of focus because they can provide insight into human diseases and genome evolution.
[2] In the 1950s, Barbara McClintock first observed DNA transposition and illustrated the functions of the centromere and telomere at the Cold Spring Harbor Symposium.
In the 1990s, more research was conducted to elucidate the evolutionary dynamics of minisatellite and microsatellite repeats because of their importance in DNA-based forensics and molecular ecology.
[6] In the 2000s, data from full eukaryotic genome sequencing enabled the identification of various promoters, enhancers, and regulatory RNAs, all of which are encoded by repetitive regions.
Many repeat sequences are likely to be non-functional, decaying remnants of transposable elements; these have been labelled "junk" or "selfish" DNA.
Recombination is important as a source of genetic diversity, as a mechanism for repairing damaged DNA, and as a necessary step in the appropriate segregation of chromosomes in meiosis.
[14] The presence of repeated sequence DNA makes it easier for areas of homology to align, thereby controlling when and where recombination occurs.
[15] These repeats fold into highly organized G-quadruplex structures, which protect the ends of chromosomal DNA from degradation.
[16] Pericentromeric heterochromatin, the DNA which surrounds the centromere and is important for structural maintenance, is composed of a mixture of different satellite subfamilies including the α-, β- and γ-satellites as well as HSATII, HSATIII, and sn5 repeats.
[24] Since uncontrolled propagation of TEs could wreak havoc on the genome, many regulatory mechanisms have evolved to silence their spread, including DNA methylation, histone modifications, non-coding RNAs (ncRNAs) such as small interfering RNA (siRNA), chromatin remodelers, histone variants, and other epigenetic factors.
[22] In some cases, host organisms find new functions for the proteins which arise from expressing TEs in an evolutionary process called TE exaptation.
[25] Furthermore, TEs contribute to regulating the expression of other genes by serving as distal enhancers and transcription factor binding sites.
Homologous recombination between chromosomal repeated sequences in somatic cells of Nicotiana tabacum was found to be increased by exposure to mitomycin C, a bifunctional alkylating agent that crosslinks DNA strands.
Inverted repeats can play structural roles in DNA and RNA by forming stem loops and cruciforms.
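The stem-loop idea can be illustrated with a minimal sketch that scans a DNA string for inverted repeats, that is, a short arm followed within a few bases by its own reverse complement. The function names, the default arm length, and the maximum loop size are illustrative assumptions, not values taken from the literature cited here.

```python
def reverse_complement(seq: str) -> str:
    """Return the reverse complement of a DNA sequence (A<->T, C<->G)."""
    complement = {"A": "T", "T": "A", "C": "G", "G": "C"}
    return "".join(complement[base] for base in reversed(seq.upper()))


def find_inverted_repeats(seq: str, arm_len: int = 4, max_loop: int = 6):
    """Return (left_start, right_start, arm) for every arm whose reverse
    complement begins within max_loop bases after the arm ends.

    arm_len and max_loop are hypothetical defaults chosen for illustration.
    """
    seq = seq.upper()
    hits = []
    for i in range(len(seq) - 2 * arm_len + 1):
        arm = seq[i:i + arm_len]
        target = reverse_complement(arm)
        for loop in range(max_loop + 1):
            j = i + arm_len + loop
            # A slice running past the end is shorter than target, so it never matches.
            if seq[j:j + arm_len] == target:
                hits.append((i, j, arm))
    return hits


# A toy sequence: the arm GGGC, a four-base loop AAAA, then GCCC.
print(find_inverted_repeats("GGGCAAAAGCCC"))
```

In the toy sequence, the left arm and the downstream reverse complement could pair to form the stem, leaving the intervening bases as the loop. Real structure prediction also weighs thermodynamic stability, which this sketch ignores.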
[29] Trinucleotide repeat expansions in the germline over successive generations can lead to increasingly severe manifestations of the disease.
[30] Huntington's disease is a neurodegenerative disorder caused by expansion of the repeated trinucleotide sequence CAG in exon 1 of the huntingtin gene (HTT).
This gene encodes the protein huntingtin, which plays a role in preventing apoptosis,[31] a form of programmed cell death, and in the repair of oxidative DNA damage.
[32] In Huntington's disease, the expanded CAG trinucleotide sequence encodes a mutant huntingtin protein with an expanded polyglutamine domain.
[33] This domain causes the protein to form aggregates in nerve cells, preventing normal cellular function and resulting in neurodegeneration.
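The link between the CAG tract and the polyglutamine domain can be sketched in a few lines: CAG is a codon for glutamine (Q), so each additional CAG unit adds one glutamine to the encoded tract. The helper names and the toy sequence below are hypothetical, and the sketch assumes that the longest uninterrupted CAG run lies in the reading frame.

```python
import re


def longest_cag_run(seq: str) -> int:
    """Return the number of CAG units in the longest uninterrupted CAG tract."""
    runs = re.findall(r"(?:CAG)+", seq.upper())
    return max((len(run) // 3 for run in runs), default=0)


def polyglutamine_tract(seq: str) -> str:
    """Translate the longest CAG tract into glutamine residues (one Q per CAG).

    Assumes the tract is read in frame, which this sketch does not verify.
    """
    return "Q" * longest_cag_run(seq)


# Toy sequence (not real HTT sequence): five CAG units flanked by other bases.
toy = "ATGGCG" + "CAG" * 5 + "CCGCCA"
print(longest_cag_run(toy), polyglutamine_tract(toy))
```

A longer CAG run in the input yields a proportionally longer glutamine tract, which is the sense in which the expansion produces an expanded polyglutamine domain.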
Because of technical limitations, many researchers have historically left out repetitive sequences when analyzing and publishing whole-genome data.
[42] The method combines a linear vector for stabilization with exonuclease III for deletion of continuing simple sequence repeat (SSR)-rich regions.