Epitranscriptomic sequencing

[4][3] The two methods were optimized to detect methylation peaks in poly(A)+ mRNA, but the protocol could be adapted to profile any type of RNA.

These two methods had several drawbacks: (1) they required substantial input material, (2) their low resolution made it difficult to pinpoint the actual site carrying the m6A mark, and (3) they could not directly assess false positives.

[6] By additionally referencing the m6A consensus motif and eliminating false-positive m6A peaks using negative control samples, m6A profiling in yeast could be performed at single-base resolution.
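The two filters described above can be illustrated with a minimal sketch. It assumes the DRACH motif as the consensus (the yeast study may have used an organism-specific motif) and represents peaks as half-open `(start, end)` intervals; the function names `candidate_sites` and `filter_peaks` are hypothetical and not taken from any published pipeline.

```python
import re

# Assumption: DRACH (D = A/G/U, R = A/G, H = A/C/U) stands in for the
# consensus motif. The lookahead lets overlapping motif instances be found.
DRACH = re.compile(r"(?=([AGU][AG]AC[ACU]))")

def candidate_sites(seq, peak_start, peak_end):
    """Return 0-based positions of the motif's central A inside a peak."""
    sites = []
    for match in DRACH.finditer(seq):
        a_pos = match.start() + 2  # the candidate A is the third motif base
        if peak_start <= a_pos < peak_end:
            sites.append(a_pos)
    return sites

def filter_peaks(peaks, control_peaks):
    """Drop peaks that overlap any peak also called in the negative control."""
    def overlaps(a, b):
        return a[0] < b[1] and b[0] < a[1]
    return [p for p in peaks if not any(overlaps(p, c) for c in control_peaks)]
```

Restricting a broad peak to motif-matching adenosines is what narrows the call from a region of roughly a hundred nucleotides to individual candidate bases.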

[6] Peptide fragments that remain after antibody removal from RNA cause the base to be read as a C as opposed to a T during reverse transcription, effectively inducing a point mutation at the 4SU crosslinking site.

Another caveat is that the position of 4SU incorporation can vary relative to any single m6A residue, so it remains challenging to locate the m6A site precisely using the T-to-C mutation.
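As an illustration of how T-to-C conversions might be tallied, the sketch below counts them per reference position. It assumes ungapped alignments supplied as `(start, sequence)` pairs, which is a simplification; real data would normally be read from an alignment file. The function name `t_to_c_counts` is hypothetical.

```python
from collections import Counter

def t_to_c_counts(reference, reads):
    """Count T>C mismatches per 0-based reference position.

    `reads` is a list of (start, sequence) pairs for ungapped alignments,
    a simplifying assumption made for this sketch.
    """
    counts = Counter()
    for start, seq in reads:
        for offset, base in enumerate(seq):
            pos = start + offset
            if pos < len(reference) and reference[pos] == "T" and base == "C":
                counts[pos] += 1
    return counts
```

Positions with many conversions mark crosslinking sites, which, as noted above, lie near but not necessarily at the methylated adenosine.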

These UV-based strategies use antibodies that induce consistent and predictable mutational and truncation patterns in the cDNA strand during reverse transcription, which can be leveraged to locate the m6A site more precisely.

[7][8] Though both m6A-CLIP and miCLIP rely on UV-induced mutations, m6A-CLIP[7] is distinct in exploiting the fact that m6A alone can induce cDNA truncation during reverse transcription. These m6A-induced truncation sites (MITS) yield single-nucleotide mapping for over tenfold more precise m6A sites, permitting comprehensive and unbiased m6A mapping.
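Truncation-based mapping can be sketched as a pile-up of read start positions: where many cDNAs terminate at the same coordinate, a truncation site is inferred. The threshold `min_reads` and the `offset` between the read start and the modified base are hypothetical parameters chosen for illustration, not values from the m6A-CLIP protocol.

```python
from collections import Counter

def truncation_sites(read_starts, min_reads=3, offset=0):
    """Return sorted positions where at least `min_reads` reads begin.

    `offset` shifts each reported position relative to the read start;
    the correct value depends on the protocol and is an assumption here.
    """
    starts = Counter(read_starts)
    return sorted(pos + offset for pos, n in starts.items() if n >= min_reads)
```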

The precise locations of tens of thousands of m6A sites in human and mouse mRNAs, mapped by m6A-CLIP, reveal that m6A is enriched in the last exon but not around the stop codon.

[8] SCARLET (site-specific cleavage and radioactive-labeling followed by ligation-assisted extraction and thin-layer chromatography) is used to determine the fraction of RNA in a sample that carries a methylated adenine at a specific site.

The chimeric oligonucleotide serves as a guide to allow RNase H to cleave the RNA strand precisely at the 5’-end of the candidate site.

This radiolabeled product is then isolated and digested by nuclease to generate a mixture of modified and unmodified adenosines (5’-P-m6A and 5’-P-A), which are separated using thin-layer chromatography.
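Once the two species are separated, the modification fraction at the site follows from the relative intensity of the two spots. The sketch below assumes background-subtracted intensities as inputs; the function name `modification_fraction` is hypothetical.

```python
def modification_fraction(m6a_signal, a_signal):
    """Fraction of RNA methylated at the probed site.

    Inputs are the radioactive signal intensities of the m6A and A spots,
    assumed here to be already background-subtracted.
    """
    total = m6a_signal + a_signal
    if total <= 0:
        raise ValueError("total spot intensity must be positive")
    return m6a_signal / total
```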

Despite the advances in m6A sequencing, several challenges remain: (1) no method has yet been developed that characterizes the stoichiometry between different sites in the same transcript; (2) analysis results depend heavily on the bioinformatics algorithm used to call the peaks; and (3) current methods all use m6A-specific antibodies to tag m6A sites, but these antibodies have been reported to carry an intrinsic bias for certain RNA sequences.

The 2'-O-methylation of the ribose moiety is one of the most common RNA modifications and is present in diverse highly abundant non-coding RNAs (ncRNAs) and at the 5' cap of mRNAs.

A novel method, Nm-REP-seq, was developed for the transcriptome-wide identification of 2'-O-methylation sites at single-base resolution by using RNA exoribonuclease (Mycoplasma genitalium RNase R, MgR) and periodate oxidation reactivity to eliminate 2'-hydroxylated (2'-OH) nucleosides.

As a result, it is difficult to pre-enrich RNA molecules or to obtain enough PCR product of the correct size for deep sequencing.

After bisulfite treatment of fragmented RNA, reverse transcription is performed, followed by PCR amplification of the cDNA products, and finally deep sequencing is carried out on the Roche 454 platform.
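Because bisulfite converts unmethylated cytosine to uracil (read as T in the cDNA) while m5C is protected, the methylation level at a cytosine can be estimated as the share of reads that still show C. The following is a minimal sketch assuming ungapped `(start, sequence)` alignments written in the DNA alphabet; `m5c_rates` and `min_coverage` are hypothetical names.

```python
def m5c_rates(reference, reads, min_coverage=2):
    """Per-cytosine non-conversion rate from ungapped (start, sequence) reads."""
    kept = {}   # position -> reads still showing C (candidate m5C)
    total = {}  # position -> reads showing either C or T
    for start, seq in reads:
        for offset, base in enumerate(seq):
            pos = start + offset
            if pos >= len(reference) or reference[pos] != "C" or base not in "CT":
                continue
            total[pos] = total.get(pos, 0) + 1
            kept[pos] = kept.get(pos, 0) + (base == "C")
    # Report only positions with enough informative reads.
    return {p: kept[p] / n for p, n in total.items() if n >= min_coverage}
```

A rate near zero indicates complete conversion (no methylation), whereas incomplete conversion of unmethylated cytosines would inflate the estimate, so real analyses also need a conversion-efficiency control.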

[1] Aza-IP (5-azacytidine-mediated RNA immunoprecipitation) has been optimized for and used to detect targets of methyltransferases, particularly NSUN2 and DNMT2,[18] the two main enzymes responsible for laying down the m5C mark.

[18] An important additional feature is that RNA methyltransferase covalent linkage to the C5 of m-aza-C induces rearrangement and ring opening.

Both miCLIP and Aza-IP, though limited by specific targeting of enzymes, can allow for the detection of low-abundance methylated RNA without deep sequencing.

One particular pipeline, called RNA and DNA differences (RDD), claims to exclude false positives, but only 56.8% of its A-to-I sites were found to be valid by ICE-seq[19] (see below).

The background noise caused by single nucleotide polymorphisms (SNPs), somatic mutations, pseudogenes and sequencing errors reduces the reliability of the signal, especially in a single-cell context.
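One common way to suppress such noise is to require a reference adenosine, a minimum read depth and a minimum A-to-G frequency, and to discard positions that coincide with known SNPs. The sketch below illustrates this under stated assumptions: the pileup structure, the function name `call_a_to_i` and the default thresholds are all hypothetical, not those of any specific pipeline.

```python
def call_a_to_i(pileup, known_snps, min_depth=10, min_freq=0.1):
    """Return {position: editing_frequency} for candidate A-to-I sites.

    pileup: {position: (reference_base, {read_base: count})}
    known_snps: set of positions to exclude as genomic variants.
    """
    calls = {}
    for pos, (ref, counts) in pileup.items():
        if ref != "A" or pos in known_snps:
            continue
        depth = sum(counts.values())
        g_reads = counts.get("G", 0)  # inosine is read as G after sequencing
        if depth >= min_depth and g_reads / depth >= min_freq:
            calls[pos] = g_reads / depth
    return calls
```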

RNA samples are treated with glyoxal and borate to specifically modify all G bases, and are then enzymatically digested by RNase T1, which cleaves after I sites.

[22] The original ICE protocol involved an RT-PCR amplification step and therefore required primers and knowledge of the location or regions to be investigated,[23] alongside a maximum cDNA length of 300–500 bp.

[24] Both ICE and ICE-seq suffer from a lack of sensitivity to infrequently edited locations: it becomes difficult to distinguish a modification with a frequency of <10% from a false positive.
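The sensitivity limit can be made concrete with a small decision rule that compares the G fraction at a site in the untreated and chemically treated samples. In this sketch the 10% floor comes from the text above, while the function name `ice_call` and the `min_drop` cutoff are hypothetical choices made for illustration.

```python
def ice_call(g_frac_untreated, g_frac_treated, min_freq=0.10, min_drop=0.5):
    """Classify a site from the G-read fraction before and after treatment.

    Sites whose untreated G fraction is below `min_freq` are reported as
    ambiguous, since they cannot be told apart from false positives.
    """
    if g_frac_untreated < min_freq:
        return "ambiguous"
    # Relative loss of G reads once modified inosines truncate the cDNA.
    drop = (g_frac_untreated - g_frac_treated) / g_frac_untreated
    return "inosine" if drop >= min_drop else "not supported"
```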

Knocking these down in the cell and then comparing the RNA content of ADAR+ and ADAR- cells would therefore be expected to provide a basis for A-to-I modification profiling.

[33] As neither of these changes affects its base-pairing properties, both will have the same output when directly sequenced; therefore methods for its detection involve prior biochemical modification.

The method causes substantial RNA degradation, so it is necessary to start with a large amount of sample or to use effective normalisation techniques to account for amplification biases.

To observe the in vivo addition of methyl groups to cytosine RNA residues followed by oxidative processing, mice can be fed a diet incorporating particular isotopes, which can then be traced by LC-MS/MS analysis.

This technique exploits nanometer-sized protein channels embedded in a membrane or solid material and coupled to sensors that detect the amplitude and duration of variations in the ionic current passing through the pore.
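The two measured quantities, amplitude and duration, can be illustrated with a toy event detector that scans a current trace for runs of samples deviating from the open-pore baseline. This is a deliberately simplified sketch: production basecallers use far more elaborate signal models, and the function name `detect_events` and its parameters are hypothetical.

```python
def detect_events(current, baseline, threshold, sample_rate_hz):
    """Return (mean_amplitude, duration_s) for each run of samples whose
    deviation from `baseline` exceeds `threshold` in absolute value."""
    events, run = [], []
    for value in list(current) + [baseline]:  # sentinel closes a trailing run
        deviation = baseline - value
        if abs(deviation) > threshold:
            run.append(deviation)
        elif run:
            events.append((sum(run) / len(run), len(run) / sample_rate_hz))
            run = []
    return events
```

Modified and unmodified nucleotides are distinguished when they produce different amplitude or dwell-time signatures as they pass through the pore.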

By producing single-molecule reads, without previous RNA amplification and conversion to cDNA, these techniques can lead to the production of quantitative transcriptome-wide maps.

Schematic diagram of epitranscriptomic sequencing workflows.
The major classes of RNA modifications.
Methods developed to profile N6-methyladenosine.
Methods developed to profile 5-methylcytidine.
Method of inosine identification: Inosine base pairs with cytidine while adenosine pairs with uracil.
ICE-based and CMC-based detection of inosine and pseudouridine. In both cases a truncated cDNA molecule is produced.