Long non-coding RNA

[3] Given that some lncRNAs have been reported to have the potential to encode small proteins or micro-peptides, the latest definition of lncRNA is a class of transcripts of over 200 nucleotides that have no or limited coding capacity.

[4] However, John S. Mattick and colleagues suggested changing the definition of long non-coding RNAs to transcripts of more than 500 nt, which are mostly generated by Pol II.

[6] Long non-coding RNAs include intergenic lincRNAs, intronic ncRNAs, and sense and antisense lncRNAs, each type showing different genomic positions in relation to genes and exons.
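
The positional categories above can be illustrated with a minimal sketch. It assumes a deliberately simplified rule set (a single reference gene, inclusive coordinates, and any opposite-strand overlap counted as antisense); the `Gene` class and `classify_lncrna` function are hypothetical names introduced here for illustration and do not reproduce any annotation pipeline's actual criteria.

```python
from dataclasses import dataclass
from typing import List, Tuple


@dataclass
class Gene:
    """Hypothetical, simplified gene model used only for this illustration."""
    start: int  # inclusive genomic coordinate (assumed convention)
    end: int    # inclusive genomic coordinate
    strand: str  # "+" or "-"
    exons: List[Tuple[int, int]]  # inclusive (start, end) pairs


def overlaps(a_start: int, a_end: int, b_start: int, b_end: int) -> bool:
    """True if two inclusive intervals share at least one position."""
    return a_start <= b_end and b_start <= a_end


def classify_lncrna(start: int, end: int, strand: str, gene: Gene) -> str:
    """Classify a transcript by its position relative to one gene.

    Simplifying assumptions: no overlap with the gene body means intergenic,
    any opposite-strand overlap means antisense, same-strand overlap with an
    exon means sense, and same-strand overlap confined to introns means
    intronic.
    """
    if not overlaps(start, end, gene.start, gene.end):
        return "intergenic"
    if strand != gene.strand:
        return "antisense"
    if any(overlaps(start, end, ex_start, ex_end) for ex_start, ex_end in gene.exons):
        return "sense"
    return "intronic"
```

Under these assumptions, a same-strand transcript lying wholly between two exons is labelled intronic, whereas real annotation schemes may subdivide the categories further (for example, bidirectional or intronic antisense transcripts).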

[9] The FANTOM3 project identified ~35,000 non-coding transcripts that bear many signatures of messenger RNAs, including 5' capping, splicing, and poly-adenylation, but have little or no open reading frame (ORF).
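
As a rough illustration of what "over 200 nucleotides" with "little or no open reading frame" can mean in practice, the sketch below measures the longest ORF on one strand and applies two cutoffs. The 300 nt ORF cutoff and the function names are assumptions chosen for demonstration only; they are not taken from FANTOM3 or from any published classifier, and real tools weigh additional evidence such as conservation and codon usage.

```python
# Illustrative sketch only: thresholds and names are assumptions,
# not a published lncRNA classification method.

STOP_CODONS = {"TAA", "TAG", "TGA"}


def longest_orf_nt(seq: str) -> int:
    """Return the length in nucleotides of the longest ORF on this strand.

    An ORF here runs from an ATG through the first in-frame stop codon,
    inclusive. ORFs that reach the end of the sequence without a stop
    codon are not counted.
    """
    seq = seq.upper().replace("U", "T")  # accept RNA or DNA letters
    best = 0
    for frame in range(3):
        start = None  # position of the first ATG in the currently open ORF
        for i in range(frame, len(seq) - 2, 3):
            codon = seq[i:i + 3]
            if start is None and codon == "ATG":
                start = i
            elif start is not None and codon in STOP_CODONS:
                best = max(best, i + 3 - start)
                start = None
    return best


def looks_like_lncrna(seq: str, min_length: int = 200, max_orf_nt: int = 300) -> bool:
    """True if the transcript is longer than min_length nucleotides and its
    longest ORF does not exceed max_orf_nt (an assumed, illustrative cutoff)."""
    return len(seq) > min_length and longest_orf_nt(seq) <= max_orf_nt
```

Calling `looks_like_lncrna(seq, min_length=500)` applies the stricter 500 nt length threshold proposed by Mattick and colleagues instead of the 200 nt default.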

The GENCODE consortium has collated and analysed a comprehensive set of human lncRNA annotations and their genomic organisation, modifications, cellular locations and tissue expression profiles.

Noncoding RNAs act upon different aspects of this process, targeting transcriptional modulators, RNA polymerase (RNAP) II and even the DNA duplex to regulate gene expression.

The existence of other similar ultra- or highly conserved elements within the mammalian genome that are both transcribed and fulfill enhancer functions suggests that Evf-2 may be illustrative of a generalised mechanism that regulates developmental genes with complex expression patterns during vertebrate growth.

[65][66] Indeed, the transcription and expression of similar non-coding ultraconserved elements were shown to be abnormal in human leukaemia and to contribute to apoptosis in colon cancer cells, suggesting their involvement in tumorigenesis in a manner similar to protein-coding RNA.

The recruitment of TLS to the promoter of cyclin D1 is directed by long ncRNAs expressed at low levels and tethered to 5' regulatory regions in response to DNA damage signals.

In the broad sense, this mechanism allows the cell to harness RNA-binding proteins, which make up one of the largest classes within the mammalian proteome, and integrate their function in transcriptional programs.

[74] This novel mechanism of regulating gene expression may represent a widespread method of controlling promoter usage, as thousands of RNA-DNA triplexes exist in eukaryotic chromosomes.

These examples, which bypass specific modes of regulation at individual promoters, provide a means of quickly effecting global changes in gene expression.

This prompted the authors to posit a 'cogene/gene' functional regulatory network,[95] showing that one of these ncRNAs, 21A, regulates the expression of its antisense partner gene, CENP-F, in trans.

Likewise, the expression of an overlapping antisense Rev-ErbAα2 transcript controls the alternative splicing of the thyroid hormone receptor ErbAα2 mRNA to form two antagonistic isoforms.

[102] Indeed, it was recently shown that BC1 is associated with translational repression in dendrites to control the efficiency of dopamine D2 receptor-mediated transmission in the striatum,[103] and BC1 RNA-deleted mice exhibit behavioural changes with reduced exploration and increased anxiety.

However, the generation of endo-siRNAs from antisense transcripts or pseudogenes may also silence the expression of their functional counterparts via RISC effector complexes, acting as an important node that integrates various modes of long and short RNA regulation, as exemplified by Xist and Tsix (see above).

[115] In Drosophila, long ncRNAs induce the expression of the homeotic gene, Ubx, by recruiting and directing the chromatin modifying functions of the trithorax protein Ash1 to Hox regulatory elements.

[114] Similar models have been proposed in mammals, where strong epigenetic mechanisms are thought to underlie the embryonic expression profiles of the Hox genes that persist throughout human development.

HOTAIR is thought to achieve this by directing the action of Polycomb chromatin remodeling complexes in trans to govern the cells' epigenetic state and subsequent gene expression.

[120] A detailed analysis showed the p15 antisense ncRNA (CDKN2BAS) was able to induce changes to heterochromatin and DNA methylation status of p15 by an unknown mechanism, thereby regulating p15 expression.

[125] In a manner similar to HOTAIR (see above), Eed-Ezh2 Polycomb complexes are recruited to the Kcnq1 locus on the paternal chromosome, possibly by Kcnq1ot1, where they may mediate gene silencing through repressive histone methylation.

[125] A differentially methylated imprinting centre also overlaps the promoter of a long antisense ncRNA, Air, that is responsible for the silencing of neighbouring genes at the Igf2r locus on the paternal chromosome.

Gene mutations or variation in expression levels of such RNAs can lead to local DNA repair defects, increasing the chromosome aberration frequency.

The first published report of an alteration in lncRNA abundance in aging and human neurological disease was provided by Lukiw et al.[140] in a study using short post-mortem interval tissues from patients with Alzheimer's disease and non-Alzheimer's dementia (NAD); this early work was based on the prior identification by Watson and Sutcliffe in 1987 of a primate brain-specific cytoplasmic transcript of the Alu repeat family, known as BC200 (brain, cytoplasmic, 200 nucleotide).

For example, in prostate tumours, PCGEM1 (one of two overexpressed ncRNAs) is correlated with increased proliferation and colony formation suggesting an involvement in regulating cell growth.

[143] MALAT1 (also known as NEAT2) was originally identified as an abundantly expressed ncRNA that is upregulated during metastasis of early-stage non-small cell lung cancer, and its overexpression is an early prognostic marker for poor patient survival rates.

Further analysis of one ultraconserved ncRNA suggested it behaved like an oncogene by mitigating apoptosis and subsequently expanding the number of malignant cells in colorectal cancers.

For example, the induction of an antisense transcript by a genetic mutation led to DNA methylation and silencing of sense genes, causing β-thalassemia in a patient.

A third hypothesis posits that lncRNAs might exhibit a largely unstructured architecture, with loosely organized protein-binding domains interspersed with long regions of disordered single-stranded RNA.

[162] Studying the tertiary structure of lncRNAs by conventional methods such as X-ray crystallography, cryo-EM and nuclear magnetic resonance (NMR) is still hampered by their size and conformational dynamics, and by the fact that too little is currently known about their mechanisms to reconstitute stable and functionally active lncRNA-ribonucleoprotein complexes.

Different types of long non-coding RNAs.[1]