Gene co-expression network

In principle, they all follow a two-step approach: calculating a co-expression measure and selecting a significance threshold.
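The two steps can be sketched in a few lines of Python. This is only an illustrative sketch: it uses Pearson correlation as the co-expression measure and a fixed absolute cutoff of 0.8 as the threshold. The function names, the toy expression values, and the cutoff are assumptions made for the example, not part of any particular published method.

```python
import math
from itertools import combinations

def pearson(x, y):
    """Pearson correlation between two equal-length expression profiles."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)

def coexpression_edges(expr, cutoff=0.8):
    """Step 1: score every gene pair. Step 2: keep pairs with |r| >= cutoff."""
    edges = []
    for g1, g2 in combinations(sorted(expr), 2):
        r = pearson(expr[g1], expr[g2])
        if abs(r) >= cutoff:
            edges.append((g1, g2, r))
    return edges

# Hypothetical toy data: one row per gene, one column per sample.
expr = {
    "geneA": [1.0, 2.0, 3.0, 4.0, 5.0],
    "geneB": [2.1, 3.9, 6.2, 8.1, 9.8],   # rises together with geneA
    "geneC": [5.0, 1.0, 4.0, 2.0, 3.0],   # no clear relation to geneA or geneB
}
edges = coexpression_edges(expr, cutoff=0.8)
for g1, g2, r in edges:
    print(g1, g2, round(r, 3))
```

With this toy data only the geneA and geneB pair passes the cutoff, so the resulting network has a single edge.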

For instance, in a microarray experiment the expression values of thousands of genes are measured for several samples.

In the first step, a similarity score (co-expression measure) is calculated between each pair of rows in the expression matrix.

Furthermore, if two genes have consistently low expression levels but are otherwise randomly correlated, they might still appear close in Euclidean space.
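A small numerical illustration of this point, using made-up expression values: two weakly related genes with low expression lie close together in Euclidean space, while two perfectly correlated genes with different magnitudes lie far apart. All gene profiles below are hypothetical.

```python
import math

def euclidean(x, y):
    """Euclidean distance between two expression profiles."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(x, y)))

def pearson(x, y):
    """Pearson correlation between two equal-length expression profiles."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)

# Two low-expression genes whose profiles are essentially unrelated.
low_1 = [0.1, 0.3, 0.2, 0.4, 0.1]
low_2 = [0.2, 0.2, 0.4, 0.3, 0.3]

# Two highly expressed genes; high_2 is exactly twice high_1.
high_1 = [10, 20, 30, 40, 50]
high_2 = [20, 40, 60, 80, 100]

d_low, r_low = euclidean(low_1, low_2), pearson(low_1, low_2)
d_high, r_high = euclidean(high_1, high_2), pearson(high_1, high_2)

print("low pair : distance", round(d_low, 3), "correlation", round(r_low, 3))
print("high pair: distance", round(d_high, 3), "correlation", round(r_high, 3))
```

The low-expression pair has the smaller distance despite a correlation near zero, which is why Euclidean distance alone can suggest co-expression where there is none.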

Spearman’s rank correlation coefficient is more robust to outliers, but it is less sensitive to expression values, and in datasets with a small number of samples it may detect many false positives.
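The robustness to outliers can be seen in a toy example. The sketch below computes Spearman's coefficient as the Pearson correlation of the ranks (ties receive average ranks) and compares it with Pearson's coefficient on hypothetical data in which a single extreme sample dominates.

```python
import math

def pearson(x, y):
    """Pearson correlation between two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)

def ranks(values):
    """1-based ranks; tied values share the average of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    result = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg = (i + j) / 2 + 1  # mean of the 1-based positions i+1 .. j+1
        for k in range(i, j + 1):
            result[order[k]] = avg
        i = j + 1
    return result

def spearman(x, y):
    """Spearman correlation: Pearson correlation applied to the ranks."""
    return pearson(ranks(x), ranks(y))

# Hypothetical profiles: the first five samples are unrelated,
# the sixth sample is an extreme outlier in both genes.
gene_x = [1, 2, 3, 4, 5, 50]
gene_y = [3, 5, 1, 4, 2, 60]

r_pearson = pearson(gene_x, gene_y)
r_spearman = spearman(gene_x, gene_y)
print("Pearson :", round(r_pearson, 3))
print("Spearman:", round(r_spearman, 3))
```

Here the single outlier pushes Pearson's coefficient close to 1, while Spearman's coefficient stays low because the outlier contributes only one rank.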

Moreover, Pearson correlation assumes that the gene expression data follow a normal distribution.

Song et al.[11] have suggested biweight midcorrelation (bicor) as a good alternative to Pearson’s correlation.
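For orientation, a minimal sketch of biweight midcorrelation follows. It uses the commonly stated definition, in which each value is centered on the median and down-weighted according to its distance from the median in units of 9 times the median absolute deviation (MAD); values beyond that range receive zero weight. The toy data are hypothetical, and raising an error when the MAD is zero is a simplification chosen for this sketch, not how any particular software package handles that case.

```python
import math
from statistics import median

def pearson(x, y):
    """Pearson correlation, included only for comparison."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)

def _biweight_normalized(x):
    """Median-centered, biweight-weighted values scaled to unit length."""
    med = median(x)
    mad = median(abs(v - med) for v in x)
    if mad == 0:
        raise ValueError("MAD is zero; bicor is undefined in this sketch")
    weighted = []
    for v in x:
        u = (v - med) / (9 * mad)
        w = (1 - u * u) ** 2 if abs(u) < 1 else 0.0  # outliers get weight 0
        weighted.append((v - med) * w)
    norm = math.sqrt(sum(a * a for a in weighted))
    return [a / norm for a in weighted]

def bicor(x, y):
    """Biweight midcorrelation of two equal-length profiles."""
    return sum(a * b for a, b in zip(_biweight_normalized(x),
                                     _biweight_normalized(y)))

# Hypothetical profiles: gene_y follows gene_x except for one corrupted sample.
gene_x = [1, 2, 3, 4, 5, 6, 7]
gene_y = [2, 4, 6, 8, 10, 12, -100]

r_pearson = pearson(gene_x, gene_y)
r_bicor = bicor(gene_x, gene_y)
print("Pearson:", round(r_pearson, 3))
print("bicor  :", round(r_bicor, 3))
```

In this example the corrupted sample receives zero weight in bicor, so the positive relationship among the remaining samples is retained, whereas Pearson's coefficient turns negative.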

Another approach is to use Fisher’s Z-transformation, which calculates a z-score for each correlation based on the number of samples.
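As a rough sketch, the standard form of Fisher's transformation maps a correlation r from n samples to z = atanh(r) · sqrt(n − 3), which is approximately standard normal when the true correlation is zero. The code below assumes this textbook form; the function names and the example values of r and n are illustrative only.

```python
import math

def fisher_z_score(r, n):
    """z-score of a correlation r computed from n samples (requires n > 3)."""
    if n <= 3:
        raise ValueError("Fisher's transformation needs more than 3 samples")
    return math.atanh(r) * math.sqrt(n - 3)

def two_sided_p_value(z):
    """Two-sided p-value of z under a standard normal distribution."""
    return math.erfc(abs(z) / math.sqrt(2))

# The same correlation is more significant when more samples support it.
z_small = fisher_z_score(0.8, n=10)
z_large = fisher_z_score(0.8, n=30)
print("n=10:", round(z_small, 3), "p =", two_sided_p_value(z_small))
print("n=30:", round(z_large, 3), "p =", two_sided_p_value(z_large))
```

A threshold can then be placed on the z-score or p-value rather than on the raw correlation, so that the cutoff accounts for sample size.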

[2] Other approaches have also been used, such as threshold selection based on the clustering coefficient[12] or random matrix theory.

[15][16][17] lmQCM is an alternative to WGCNA that achieves the same goal of gene co-expression network analysis.

The generally smaller size of mined modules can also generate more meaningful gene ontology (GO) enrichment results.

[19] Second, as discussed in the previous sections, each co-expression computational measure is designed specifically to capture a unique feature that is not necessarily optimal for depicting all types of gene-to-gene transcriptional inter-relation, for example, Pearson correlation for linear relations, Spearman for the ranking of the genes, and so on.

These challenges should be considered when applying advanced co-expression methods to gene expression data.