LTR retrotransposons are class I transposable elements (TEs) characterized by the presence of long terminal repeats (LTRs) directly flanking an internal coding region.
Their size ranges from a few hundred base pairs to 30 kb; the largest elements reported to date are members of the Burro retrotransposon family in Schmidtea mediterranea.
LTR retrotransposons are further sub-classified into the Ty1-copia-like (Pseudoviridae), Ty3-like (Metaviridae, formerly referred to as Gypsy-like, a name that is being considered for retirement[4]), and BEL-Pao-like (Belpaoviridae) groups based on both their degree of sequence similarity and the order of encoded gene products.
The Pol gene produces three proteins: a protease (PR), a reverse transcriptase comprising an RT (reverse transcriptase) domain and an RNase H domain, and an integrase (IN).
Reverse transcription usually initiates at a short sequence, termed the primer binding site (PBS), located immediately downstream of the 5'-LTR.
Specific host tRNAs bind to the PBS and act as primers for reverse transcription, a complex, multi-step process that ultimately produces a double-stranded cDNA molecule.
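The relationship between the PBS and its tRNA primer can be illustrated with a minimal Python sketch. It assumes that the PBS, read on the element's coding strand, is the reverse complement of the 3' end of the priming tRNA, and that it begins within a few bases of the end of the 5'-LTR. The function names, the `max_offset` tolerance, and all sequences are hypothetical and chosen only for illustration; they are not taken from any real element or annotation tool.

```python
def reverse_complement(dna: str) -> str:
    """Return the reverse complement of a DNA string (A, C, G, T only)."""
    pairs = {"A": "T", "C": "G", "G": "C", "T": "A"}
    return "".join(pairs[base] for base in reversed(dna.upper()))


def pbs_match_length(element: str, ltr_length: int, trna_3prime: str,
                     max_offset: int = 5) -> int:
    """Longest run of bases, starting within max_offset bases after the
    5'-LTR, that is complementary to the 3' terminus of the tRNA.

    max_offset is an illustrative tolerance, not a biological constant.
    """
    # The PBS pairs with the tRNA 3' end, so on the element's strand it
    # reads as the reverse complement of that end.
    expected = reverse_complement(trna_3prime)
    best = 0
    for offset in range(max_offset + 1):
        start = ltr_length + offset
        window = element[start:start + len(expected)].upper()
        run = 0
        for observed, wanted in zip(window, expected):
            if observed != wanted:
                break
            run += 1
        best = max(best, run)
    return best


# Toy data (hypothetical sequences, not a real element or tRNA).
toy_ltr = "TGTTGGAATA"
toy_trna_tail = "GGTTCGATCCACCA"          # DNA alphabet, ends in CCA
toy_pbs = reverse_complement(toy_trna_tail)
toy_element = toy_ltr + "AT" + toy_pbs + "GGGAAAGGG" + toy_ltr

print(pbs_match_length(toy_element, len(toy_ltr), toy_trna_tail))
```

A long complementary run adjacent to the 5'-LTR is consistent with a candidate PBS for that tRNA; a short run suggests a different primer or a degraded site.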
The cDNA is finally integrated into a new location, creating short target site duplications (TSDs)[10] and adding a new copy to the host genome.
Ty1-copia retrotransposons are abundant in species ranging from single-cell algae to bryophytes, gymnosperms, and angiosperms.
They encode four protein domains in the following order: protease, integrase, reverse transcriptase, and ribonuclease H. At least two classification systems exist for the subdivision of Ty1-copia retrotransposons into five lineages:[11][12] Sireviruses/Maximus, Oryco/Ivana, Retrofit/Ale, TORK (subdivided into Angela/Sto, TAR/Fourf, and GMR/Tork), and Bianca.
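Because the groups differ in the order of their encoded domains, a rough first-pass assignment can be sketched from an ordered list of annotated domains. The Python example below is a minimal illustration, not an established classifier. The Ty1-copia order (protease, integrase, reverse transcriptase, ribonuclease H) is the one given above. The other branch assumes the arrangement commonly described for Ty3-like and BEL-Pao-like elements, in which integrase follows reverse transcriptase; domain order alone does not separate those two groups. The labels `PR`, `IN`, `RT`, and `RH` and the function name are hypothetical.

```python
CORE_DOMAINS = {"PR", "IN", "RT", "RH"}


def classify_by_domain_order(domains: list[str]) -> str:
    """Assign a coarse group from the 5'-to-3' order of Pol domains.

    domains: ordered labels from an annotation, e.g. ["PR", "IN", "RT", "RH"].
    Labels outside CORE_DOMAINS (such as Gag) are ignored.
    """
    order = [d for d in domains if d in CORE_DOMAINS]
    if "IN" not in order or "RT" not in order:
        # Without both landmarks the order is uninformative.
        return "unclassified"
    if order.index("IN") < order.index("RT"):
        # Integrase upstream of reverse transcriptase: Ty1-copia arrangement.
        return "Ty1-copia-like"
    # Assumption: integrase downstream of RT is shared by Ty3-like and
    # BEL-Pao-like elements, so sequence similarity is needed to split them.
    return "Ty3-like or BEL-Pao-like"


print(classify_by_domain_order(["GAG", "PR", "IN", "RT", "RH"]))
print(classify_by_domain_order(["GAG", "PR", "RT", "RH", "IN"]))
print(classify_by_domain_order(["GAG", "PR"]))
```

In practice, domain order would be combined with sequence similarity, since the classification described here relies on both criteria.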
The Sirevirus lineage is named for the founder element SIRE1 in the Glycine max genome,[13] and was later described in many species such as Zea mays,[14] Arabidopsis thaliana,[15] Beta vulgaris,[16] and Pinus pinaster.
[31] Mammalian retrotransposon-derived transcripts (MARTs) cannot transpose but have retained open reading frames, demonstrate high levels of evolutionary conservation and are subject to selective pressures, which suggests some have become neofunctionalized genes with new cellular functions.