Epigenome-wide association study

Aberrant DNA methylation (DNAm) is the most common type of molecular abnormality in cancer cells, where the bulk genome becomes globally ‘hypomethylated’ and CpG islands (CPIs) in promoter regions become ‘hypermethylated’, usually leading to silencing of tumour suppressor genes.

[2] More recently, studies on diabetes have uncovered further evidence to support an epigenetic component of diseases, including differences in disease-associated epigenetic marks between monozygotic twins, the rising incidence of type 1 diabetes in the general population, and developmental reprogramming events in which in utero or childhood environments can influence disease outcome in adulthood.

[1] Epigenetic variation arises in three distinct ways: it can be inherited and therefore be present in all cells of the adult, including the germline (a process known as transgenerational epigenetic inheritance, a controversial phenomenon that has not yet been observed in humans); it can occur randomly and be present in a subset of cells in the adult, with the proportion of affected cells depending on how early in development the variation occurs; or it can be induced by behavioural or environmental factors.

Because a longitudinal design follows the same individuals at time points before and after disease onset, it removes the confounding effects of differences between cases and controls.

Longitudinal studies using disease-discordant monozygotic twins give the added benefit of ruling out genetic influences on epigenetic variation.

If this is not possible, multiple serially collected samples from the same individuals are required to report robust associations with a particular phenotype.

EWAS for diseases often measure DNA methylation in blood samples because disease-relevant tissues are difficult to obtain.

The choice of blood also requires stringent analysis and careful interpretation due to variable cell type composition.

An underlying issue is that, to date, there is no clear evidence that epigenetic marks generally respond to environmental exposures in a similar way across tissues.

The initial analyses performed are univariate tests of association to identify sites where DNA methylation varies with exposure and/or phenotype.

Additionally, confounding factors that may influence methylation status, such as age, gender and behaviours, are adjusted for as covariates.

Generally, mean levels of CpG methylation are compared across categories using linear regression,[9] which allows for the adjustment of confounders and batch effects.
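
The site-by-site regression described above can be sketched as follows. This is a minimal illustration under stated assumptions, not the method of any particular EWAS pipeline: the function name `ewas_linear_scan`, the simulated data, and the use of age as the only covariate are all hypothetical, and only ordinary least squares from NumPy is used.

```python
import numpy as np


def ewas_linear_scan(beta, exposure, covariates):
    """Fit methylation ~ intercept + exposure + covariates at every CpG site.

    beta:       (n_samples, n_cpgs) methylation proportions
    exposure:   (n_samples,) exposure or phenotype coding
    covariates: (n_samples, n_covariates) confounders such as age or batch
    Returns the exposure coefficient and its t statistic for each CpG.
    """
    beta = np.asarray(beta, dtype=float)
    n_samples = beta.shape[0]
    # Design matrix: column 0 is the intercept, column 1 is the exposure.
    design = np.column_stack([np.ones(n_samples), exposure, covariates])
    # One least-squares solve fits all CpG sites at once (one column each).
    coef, *_ = np.linalg.lstsq(design, beta, rcond=None)
    residuals = beta - design @ coef
    dof = n_samples - design.shape[1]
    sigma2 = (residuals ** 2).sum(axis=0) / dof
    # Variance factor of the exposure coefficient from (X'X)^-1.
    exposure_var = np.linalg.inv(design.T @ design)[1, 1]
    std_err = np.sqrt(sigma2 * exposure_var)
    effect = coef[1]
    return effect, effect / std_err


# Simulated data (hypothetical): 200 samples, 5 CpGs, only CpG 0 tracks exposure.
rng = np.random.default_rng(0)
n = 200
exposure = rng.integers(0, 2, size=n).astype(float)
age = rng.uniform(30, 70, size=n)
beta = 0.5 + 0.002 * (age[:, None] - 50) + rng.normal(0, 0.02, size=(n, 5))
beta[:, 0] += 0.1 * exposure

effect, t_stat = ewas_linear_scan(beta, exposure, age[:, None])
top_site = int(np.argmax(np.abs(t_stat)))
```

In practice the t statistics would be converted to p-values and corrected for the number of sites tested; that step is omitted here to keep the sketch dependency-free.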

Another method of analysis uses unsupervised clustering to create classes of CpG sites based on the similarity of their methylation variation across samples.

This method is useful for identifying gross patterns of methylation associated with the tested variable, but may miss specific CpG sites of interest.
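
As an illustration of the clustering idea, the sketch below groups CpG sites with a small k-means routine written in NumPy. The choice of k-means, the function name `cluster_cpgs`, the farthest-point initialisation, and the simulated data are assumptions made for this example; the text does not prescribe a specific clustering algorithm.

```python
import numpy as np


def cluster_cpgs(beta, k, n_iter=50):
    """Group CpG sites by similarity of their methylation pattern across samples.

    beta: (n_samples, n_cpgs) methylation proportions
    Returns one cluster label per CpG site.
    """
    profiles = np.asarray(beta, dtype=float).T  # one row per CpG site
    # Standardise each site so clusters reflect the pattern of variation
    # across samples rather than the mean methylation level.
    profiles = profiles - profiles.mean(axis=1, keepdims=True)
    sd = profiles.std(axis=1, keepdims=True)
    profiles = profiles / np.where(sd > 0, sd, 1.0)

    # Deterministic farthest-point initialisation of the cluster centres.
    centres = [profiles[0]]
    for _ in range(1, k):
        dist = np.min([((profiles - c) ** 2).sum(axis=1) for c in centres], axis=0)
        centres.append(profiles[np.argmax(dist)])
    centres = np.array(centres)

    for _ in range(n_iter):
        # Assign each site to its nearest centre, then recompute the centres.
        dist = ((profiles[:, None, :] - centres[None, :, :]) ** 2).sum(axis=2)
        labels = dist.argmin(axis=1)
        new_centres = np.array([
            profiles[labels == j].mean(axis=0) if np.any(labels == j) else centres[j]
            for j in range(k)
        ])
        if np.allclose(new_centres, centres):
            break
        centres = new_centres
    return labels


# Simulated data (hypothetical): 40 samples in two groups of 20.
# CpGs 0-4 gain methylation in the second group, CpGs 5-9 lose it.
rng = np.random.default_rng(1)
group = np.repeat([0.0, 1.0], 20)
gain = 0.3 + 0.4 * group[:, None] + rng.normal(0, 0.03, size=(40, 5))
loss = 0.7 - 0.4 * group[:, None] + rng.normal(0, 0.03, size=(40, 5))
labels = cluster_cpgs(np.hstack([gain, loss]), k=2)
```

Here the two simulated classes of sites fall into separate clusters, which shows the gross pattern but says nothing about which individual site matters most.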

Enrichment analysis based on the genomic region has thus been suggested as a complementary approach and confers substantial interpretive potential.
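
One common way to test for region-based enrichment is a one-sided hypergeometric test, sketched below. This is an assumption about how such an analysis might be carried out, not a method named in the text; the function name `hypergeom_enrichment_p` and the counts in the example are hypothetical.

```python
from math import comb


def hypergeom_enrichment_p(total, in_region, selected, selected_in_region):
    """One-sided hypergeometric p-value for over-representation.

    total:              number of CpG sites tested
    in_region:          tested sites annotated to the region class (e.g. promoters)
    selected:           sites associated with the trait
    selected_in_region: associated sites that fall in the region class
    Returns P(X >= selected_in_region) when `selected` sites are drawn at random.
    """
    upper = min(in_region, selected)
    denominator = comb(total, selected)
    tail = sum(
        comb(in_region, i) * comb(total - in_region, selected - i)
        for i in range(selected_in_region, upper + 1)
    )
    return tail / denominator


# Hypothetical counts: 1,000 sites tested, 100 of them in promoters,
# 50 trait-associated sites, 15 of which are in promoters (5 expected by chance).
p_value = hypergeom_enrichment_p(1000, 100, 50, 15)
```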

The methylation odds ratio provides a measure of effect size that incorporates relative magnitudes, but it does not capture differences between cases and controls in other features of the methylation spectrum, such as variance.

The methylation odds ratio is also comparable across prospective and retrospective studies. Its value measures association only and does not imply causation.
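
The text does not give a formula for the methylation odds ratio. The sketch below assumes one plausible definition: the odds of methylation in cases divided by the odds in controls, with odds computed from the mean methylation proportion in each group. The function name and the example values are hypothetical.

```python
def methylation_odds_ratio(beta_cases, beta_controls):
    """Odds ratio of methylation at one CpG site, cases versus controls.

    Assumed definition: odds = mean / (1 - mean), where mean is the
    average methylation proportion in the group.
    """
    mean_case = sum(beta_cases) / len(beta_cases)
    mean_control = sum(beta_controls) / len(beta_controls)
    for mean in (mean_case, mean_control):
        if not 0.0 < mean < 1.0:
            raise ValueError("mean methylation must lie strictly between 0 and 1")
    odds_case = mean_case / (1.0 - mean_case)
    odds_control = mean_control / (1.0 - mean_control)
    return odds_case / odds_control


# Hypothetical site: mean methylation 0.6 in cases and 0.4 in controls.
odds_ratio = methylation_odds_ratio([0.55, 0.60, 0.65], [0.35, 0.40, 0.45])
# odds are 1.5 and 0.667, so the ratio is 2.25
```

Because only group means enter the calculation, two groups with the same mean but very different spread give an odds ratio of 1, which is the limitation noted above.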

In the current implementation, EWAS Atlas focuses on DNA methylation, one of the key epigenetic marks. It integrates 388,851 high-quality EWAS associations involving 126 tissues/cell lines and covering 351 traits, 2,230 cohorts and 390 ontology entities, all manually curated from 649 studies reported in 495 publications.

In addition, it is equipped with a powerful trait enrichment analysis tool, which is capable of profiling trait-trait and trait-epigenome relationships.

Accordingly, taking advantage of both massive high-quality DNA methylation data and standardized metadata, EWAS Data Hub provides reference DNA methylation profiles under different contexts, involving 81 tissues/cell types (including 25 brain parts and 25 blood cell types), six ancestry categories, and 67 diseases (including 39 cancers).

In summary, EWAS Data Hub holds great promise for aiding the retrieval and discovery of methylation-based biomarkers for phenotype characterization, clinical treatment and health care.

Figure: EWAS workflow.
Figure: Methylation assay workflow (from: Illumina Methylation Assay).
Figure: Workflow for an EWA study.