[1] The original meeting gathered international scientists from diverse backgrounds to help annotate the function of mouse cDNA clones generated by the Hayashizaki group.
[2] Since the initial FANTOM1 effort, the consortium has released multiple projects that look to understand the mechanisms governing the regulation of mammalian genomes.
[1] Their work has generated a large collection of shared data and helped advance biochemical and bioinformatic methodologies in genomics research.
Methodologies available at the time were insufficient to generate full-length cDNA clones at scale, and to be useful as a resource the annotations would have to be agreed upon by experts across different disciplines.
Reverse transcriptase protocols at the time had difficulties with the secondary structure of mRNA, leading to abbreviated cDNAs that were difficult to align and invited further complications in downstream analysis.
To overcome this limitation, a method using trehalose was developed that allows reverse transcriptase to function at a higher temperature, which relaxes secondary structures.
To facilitate the annotation of the mouse cDNA clones, the RIKEN research group developed a web-based service called FANTOM+ prior to the first meeting.
Predominant tools included BLASTN/BLASTX, FASTA/FASTY, DECODER, EST-WISE and HMMER, while both nucleic acid and protein databases such as SwissProt, UniGene and NCBI-nr were utilized.
Concurrently, a collaboration with the Mouse Genome Informatics group (MGI) allowed the RIKEN researchers to establish a validated set of clones that were identical between the two databases.
[1][2] Armed with computational methodologies and over 20,000 cDNA sequences, the RIKEN group organized the first FANTOM meeting in Tsukuba City from August 28 to September 8, 2000.
This provided a hierarchical and systematic means to assign functions to the clones based upon known genes, placing priority on previously established or well-curated knowledge.
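The idea of a priority-ordered assignment can be illustrated with a minimal sketch. The source ranking below and the function name `assign_annotation` are hypothetical and chosen only for illustration; they are not the rule set the FANTOM annotators actually used.

```python
# Hypothetical priority order, most trusted source first. This ranking is an
# illustrative assumption, not the published FANTOM annotation rule set.
SOURCE_PRIORITY = ["MGI", "SwissProt", "NCBI-nr", "UniGene"]


def assign_annotation(hits):
    """Return (source, description) from the highest-priority source with a hit.

    `hits` maps a source name to a description string; sources without a
    match are either absent or map to None.
    """
    for source in SOURCE_PRIORITY:
        description = hits.get(source)
        if description:
            return source, description
    # No source produced a usable match for this clone.
    return None, "unclassifiable"


# Toy evidence for one clone: two sources matched, with differing detail.
example = {
    "UniGene": "EST cluster, weakly similar to kinase",
    "SwissProt": "Serine/threonine-protein kinase",
}
print(assign_annotation(example))
```

Walking the sources in a fixed order means that a well-curated match always overrides a weaker one, regardless of the order in which the evidence was collected.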
[1][2][8] Having established and improved upon the protocols for full-length cDNA library generation, the RIKEN group continued to add to the FANTOM collection.
Arguably the most notable result of FANTOM2 was that efforts to select for long and rare transcripts had revealed a significant amount of non-protein-coding RNA.
While full-length mouse cDNAs continued to be generated, the RIKEN-led researchers established Cap Analysis of Gene Expression (CAGE), a technique that would drive much of their future work.
This entails adding biotin to the 5' cap, followed by capture with streptavidin beads after an RNase digestion step that removes single-stranded RNA that has not hybridized to cDNA.
Importantly, RNA was found to be much more abundant in the mammalian transcriptome than previously thought, accompanied by the realization that the genome was pervasively transcribed.
[18] Another notable result showed that many non-coding RNAs are dynamically expressed, with many being initiated in 3’ untranslated regions, and that they are positionally conserved across species.
This study also demonstrated that CpG-rich promoters may be bidirectional (producing sense-antisense pairs) and are highly susceptible to epigenetic control, making them a potential component of adaptive evolution.
While previous FANTOM projects examined a range of cell types, FANTOM4's purpose was to deeply interrogate the dynamics driving cellular differentiation.
[24][25] It was demonstrated that retrotransposons are expressed in a cell- and tissue-specific manner, and approximately 250,000 previously unknown retrotransposon-driven TSSs were identified.
FANTOM4 led to numerous satellite papers, investigating topics like promoter architecture, miRNA regulation and genomic regulatory blocks.
FANTOM5 focused solely on the transcriptome, relying on other published work to infer features like cell type as defined by chromatin status.
The first phase of FANTOM5 involved taking ‘snapshots’ of a wide range of steady-state cell types using CAGE profiling across 975 human and 399 mouse samples.
[35][36] Together, they provide an atlas of promoters, enhancers and TSSs across diverse cell types, acting as a ‘baseline’ for studying the complex landscape of transcription regulation.
[1][33][38] Unsupervised clustering was performed to identify a set of distinct response classes, examining patterns in expression fold changes compared to time 0.
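As a rough illustration of this kind of analysis, the sketch below converts toy expression values into log2 fold changes relative to time 0 and groups the resulting profiles. The text does not name the clustering algorithm, so plain k-means is used here as a stand-in; the pseudocount, the toy values and the transcript names are likewise assumptions made for the example.

```python
import math


def log2_fold_changes(series, pseudocount=1.0):
    """Log2 fold change of each time point relative to time 0.

    The pseudocount (an assumption of this sketch) avoids division by zero.
    """
    base = series[0] + pseudocount
    return [math.log2((value + pseudocount) / base) for value in series]


def squared_distance(a, b):
    return sum((x - y) ** 2 for x, y in zip(a, b))


def kmeans(points, k, iterations=20):
    """Plain k-means; the first k points seed the centroids (deterministic)."""
    centroids = [list(p) for p in points[:k]]
    labels = [0] * len(points)
    for _ in range(iterations):
        # Assignment step: each profile joins its nearest centroid.
        labels = [
            min(range(k), key=lambda c: squared_distance(p, centroids[c]))
            for p in points
        ]
        # Update step: each centroid moves to the mean of its members.
        for c in range(k):
            members = [p for p, label in zip(points, labels) if label == c]
            if members:  # keep the old centroid if a cluster empties
                centroids[c] = [sum(col) / len(members) for col in zip(*members)]
    return labels


# Toy expression values (arbitrary units) at time 0 and three later time points.
expression = {
    "early_up_a": [10, 80, 20, 12],
    "late_up_a": [10, 11, 30, 90],
    "early_up_b": [5, 45, 9, 6],
    "late_up_b": [20, 22, 70, 160],
}
names = list(expression)
profiles = [log2_fold_changes(expression[name]) for name in names]
clusters = dict(zip(names, kmeans(profiles, k=2)))
print(clusters)
```

Working on fold changes rather than raw values means that transcripts with different absolute expression levels but the same response shape fall into the same class.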
ZENBU is a genome browser with additional functionality: users can upload BAM files of CAGE, short-RNA and ChIP-seq experiments and perform quality control, normalization, peak finding and annotation alongside visual comparisons.
[40] The bounty of data produced by FANTOM5 continues to provide a resource for researchers looking to explain the regulatory mechanisms that shape processes like development.
Often, CAGE data from a specific cell or tissue type are used in conjunction with further epigenomic assays; one such example describes the interplay of DNA methylation and CAGE-defined regulatory sequences during granulocyte differentiation.
Based upon the few works that have examined lncRNAs, they are believed to be involved in regulating transcription, translation, post-translational modifications, and epigenetic marks.
Next, using lncRNAs identified in previous publications, FANTOM5 data and further CAGE profiling, perturbation experiments will be conducted to evaluate changes in cellular molecular phenotype.