Helitron (biology)

[1] They were first discovered in plants (Arabidopsis thaliana and Oryza sativa) and in the nematode Caenorhabditis elegans, and now they have been identified in a diverse range of species, from protists to mammals.

Helitrons make up a substantial fraction of many genomes, where non-autonomous elements frequently outnumber their putative autonomous partners.

Kapitonov and Jurka investigated the coding capacity of Helitrons in A. thaliana, Oryza sativa, and Caenorhabditis elegans using in silico studies of repetitive DNA of these organisms, computational analysis and Monte Carlo simulation.

They described the structure and coding potential of canonical Helitrons and proposed the rolling-circle mechanism of transposition as well as the possibility that some of the encoded genes captured from the host are now used for replication.

The Rep/Helicase proteins were predicted to be 500 to 700 amino acids longer because of a C-terminal fusion of a domain with homology to apurinic-apyrimidinic (AP) endonuclease.

In recent years, Helitrons have been identified in all eukaryotic kingdoms but their genomic copy numbers are highly variable, even among closely related species.

[2] In most mammals, the presence of Helitrons is negligible and limited to remnants of old transposons, with the exception of bat genomes, which are populated by numerous young elements.

[7] However, many years after the description of autonomous Helitrons, no mechanistic studies have been published, and the rolling-circle mechanism of transposition therefore remains a well-supported but untested hypothesis.

[1] Helitrons are structurally asymmetric and are the only class of eukaryotic DNA transposons that do not generate duplications of target sites during transposition.

The Rep/Helicase protein includes zinc finger motifs, the Rep domain (which is ~100 aa long and has HUH endonuclease activity), and an eight-domain Pif1 family helicase (superfamily 1), all of which are universally conserved in Helitrons.

[2] The three-dimensional structure of the Helitron transposase covalently bound to the left transposon end has recently been determined by cryo-EM.

If the palindrome and 3' end of the element are recognized correctly, cleavage occurs after the CTRR sequence and one Helitron strand is transferred to the donor site, where DNA replication resolves the heteroduplex.
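The two sequence signals mentioned here, a CTRR motif and a nearby palindrome, can be illustrated with a minimal Python sketch. It is not a validated Helitron detection tool: the function names are hypothetical, and the window, stem and loop sizes are arbitrary assumptions chosen only for illustration. The one fixed convention it relies on is the IUPAC code R, which denotes a purine (A or G).

```python
import re

# IUPAC "R" means a purine (A or G), so CTRR expands to CT[AG][AG].
# The lookahead lets overlapping matches be reported as well.
CTRR = re.compile(r"(?=CT[AG][AG])")


def reverse_complement(seq):
    """Return the reverse complement of an unambiguous DNA string."""
    return seq.upper().translate(str.maketrans("ACGT", "TGCA"))[::-1]


def find_ctrr_sites(seq):
    """Return 0-based positions immediately after each CTRR match.

    Each position is only a *candidate* cleavage point; a bare CTRR
    match is far too common to identify a Helitron end on its own.
    """
    return [m.start() + 4 for m in CTRR.finditer(seq.upper())]


def has_upstream_hairpin(seq, site, window=40, stem=6, min_loop=3):
    """Check for a simple inverted repeat in the bases before `site`.

    Returns True if a stretch of `stem` bases is followed, after at
    least `min_loop` bases, by its exact reverse complement within the
    `window` bases preceding `site`. All three sizes are illustrative
    assumptions, not values taken from the literature.
    """
    region = seq.upper()[max(0, site - window):site]
    for i in range(len(region) - 2 * stem - min_loop + 1):
        left_arm = region[i:i + stem]
        right_arm = reverse_complement(left_arm)
        if region.find(right_arm, i + stem + min_loop) != -1:
            return True
    return False


if __name__ == "__main__":
    # Toy sequence: inverted repeat (GGCACC ... GGTGCC) upstream of CTAG.
    toy = "TTGGCACCTTTTGGTGCCAAAAAAAAAACTAGTT"
    for site in find_ctrr_sites(toy):
        print(site, has_upstream_hairpin(toy, site))
```

Requiring both signals together reduces the number of chance matches, but exact-match stems miss imperfect hairpins, so any real search would need a more tolerant structural model.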

An antibiotic resistance gene was included between the two terminal sequences of the helitron to enable isolation of the cells where transposition occurred.

This model is supported by the fact that the deletion of one of the two tyrosines (Y727) of the Rep domain thought to be involved in cleavage of the strands[1] does not affect the efficiency of helitron transposition.

Only one of the tyrosines would be required,[12] in order to ensure a two-step process: 1) the cleavage of the donor DNA and 2) the integration into the target site.

[1] In this model, portions of genes or non-coding regions can accidentally serve as templates during the repair of double-strand breaks (DSBs) occurring in Helitrons during their transposition.

Evidence supporting the "read-through" models seems to lie in the relative lack of importance of the 3' RTS when compared to the 5' LTS:[10][12] deletion of the LTS leads to a severe reduction in the efficiency of helitron transposition, whereas the complete deletion of the RTS still leads to significant transposition despite a reduced number of copies.

Such a small structure is likely to be modified over time, making it possible to bypass the helitron's end during transposition and to capture neighbouring gene sequences.

[7] Helitrons can drive expression and provide de novo regulatory elements such as CAAT-box, GC-box, octamer motif, and TATA-box sites.

A number of spontaneous mutations have been reported in plants that are caused by intronic Helitron insertions that result in the generation of chimeric transcript species.

As these programs are trained on known Helitron elements, they may not be efficient at identifying divergent families and they generate many false positives.

A repeat-based search requires extensive manual curation to identify Helitron families, an overwhelming task in large genomes with substantial DNA repetition.

Researchers found evidence for the repeated horizontal transfer (HT) of four different families of Helitrons in an unprecedented array of organisms, including mammals, reptiles, fish, invertebrates, and insect viruses.

The Helitrons present in these species have a patchy distribution and are closely related (80–98% sequence identity), despite the deep divergence times among hosts.
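For readers unfamiliar with the measure, percent sequence identity can be computed from a pairwise alignment as in the following Python sketch. The source does not state how the 80–98% figures were derived, so the convention used here (gapped columns are skipped, comparison is case-insensitive) and the function name are assumptions made purely for illustration.

```python
def percent_identity(aligned_a, aligned_b, gap="-"):
    """Percent identity between two already-aligned sequences.

    Columns where either sequence has a gap are ignored (an assumed
    convention; other definitions count gaps as mismatches).
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")

    compared = 0
    matches = 0
    for x, y in zip(aligned_a.upper(), aligned_b.upper()):
        if x == gap or y == gap:
            continue  # skip gapped columns
        compared += 1
        if x == y:
            matches += 1

    if compared == 0:
        raise ValueError("no ungapped columns to compare")
    return 100.0 * matches / compared


if __name__ == "__main__":
    # Nine of ten aligned positions match.
    print(percent_identity("ACGTACGTAC", "ACGTACGTAA"))
```

Because the result depends on the alignment and on how gaps are treated, identity values are only comparable when they are computed under the same convention.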

Considering the young age of these families and the extent of protein conservation, it is highly unlikely that the observed divergence resulted from mutations accumulated by the transposons after integration into the host genome, which suggests that Helitrons act as a powerful tool of evolution.

Structure and coding capacity of canonical animal and plant Helitrons
Rolling-Circle Mechanism for Helitron transposition and gene acquisition in the concerted model
a) Plasmid containing the helitron: the antibiotic resistance gene (kanamycin) is inserted between the left and right terminal sequences (LTS and RTS respectively); b) Circular intermediate of transposition: the terminal sequences are joined together (grey arrow indicates promoter of the gene)
Donor sequence (black) and target sequence (blue); helitron divided into three parts (LTS in blue, coding sequence in grey and RTS in purple). a) tyrosine of the Rep-Hel protein cleaves 5' end of the LTS in the donor sequence; b) using helicase activity from 5' to 3', Rep-Hel rolls to the 3' end of the RTS; c) cleavage of the 3' end after detection of the RTS; d) joining of the end sequences and formation of circle intermediate; e) cleavage of the target strand and integration of the helitron after passive resolution
Pipeline for genome-wide identification of candidate Helitrons and their verification