Biological network inference

[1] By using these networks to analyze patterns in biological systems, such as food webs, we can visualize the nature and strength of the interactions between species, DNA, proteins, and more.

[4] Good network inference requires proper planning and execution of an experiment, thereby ensuring quality data acquisition.

Optimal experimental design refers, in principle, to the use of statistical and/or mathematical concepts to plan data acquisition.

This article focuses on inference of biological network structure using the growing sets of high-throughput expression data for genes, proteins, and metabolites.

[10] Briefly, methods using high-throughput data for inference of regulatory networks rely on searching for patterns of partial correlation or conditional probabilities that indicate causal influence.
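As a minimal sketch of the partial-correlation idea, the snippet below computes a first-order partial correlation between two expression profiles while controlling for a third. The function names and the toy profiles are hypothetical, chosen only for illustration. The data are constructed so that a shared regulator `z` drives both `x` and `y`.

```python
import math

def pearson(x, y):
    """Pearson correlation of two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    vx = sum((a - mx) ** 2 for a in x)
    vy = sum((b - my) ** 2 for b in y)
    return cov / math.sqrt(vx * vy)

def partial_corr(x, y, z):
    """First-order partial correlation of x and y, controlling for z."""
    rxy, rxz, ryz = pearson(x, y), pearson(x, z), pearson(y, z)
    return (rxy - rxz * ryz) / math.sqrt((1 - rxz ** 2) * (1 - ryz ** 2))

# Toy (made-up) expression profiles: z is a shared regulator, and x and y
# each equal z plus independent noise.
z = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0, 7.0, 8.0]
x = [1.5, 1.5, 2.5, 4.5, 5.5, 5.5, 6.5, 8.5]
y = [1.5, 2.5, 2.5, 3.5, 4.5, 5.5, 7.5, 8.5]

print("raw correlation x~y:     ", round(pearson(x, y), 3))
print("partial correlation x~y|z:", round(partial_corr(x, y, z), 3))
```

In this constructed example the raw correlation between `x` and `y` is high, but it vanishes once `z` is accounted for. That is the pattern such methods use to distinguish a direct influence from one that runs through a common regulator.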

[12] Clustering or some form of statistical classification is typically employed to perform an initial organization of the high-throughput mRNA expression values derived from microarray experiments, in particular to select sets of genes as candidates for network nodes.
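One simple way to illustrate this initial organization is single-linkage grouping on profile correlation: genes whose expression profiles correlate above a cutoff are merged into the same cluster. The sketch below is illustrative only. The gene names, expression values, and the `min_r` cutoff are assumptions, and real analyses typically use more robust clustering methods.

```python
import math
from itertools import combinations

def pearson(x, y):
    """Pearson correlation of two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    vx = sum((a - mx) ** 2 for a in x)
    vy = sum((b - my) ** 2 for b in y)
    return cov / math.sqrt(vx * vy)

def cluster_by_correlation(profiles, min_r=0.9):
    """Single-linkage clusters: merge genes whose correlation is >= min_r."""
    genes = list(profiles)
    parent = {g: g for g in genes}  # union-find forest

    def find(g):
        while parent[g] != g:
            parent[g] = parent[parent[g]]  # path compression
            g = parent[g]
        return g

    for a, b in combinations(genes, 2):
        if pearson(profiles[a], profiles[b]) >= min_r:
            parent[find(a)] = find(b)

    clusters = {}
    for g in genes:
        clusters.setdefault(find(g), []).append(g)
    return sorted(sorted(c) for c in clusters.values())

# Hypothetical expression values across four conditions.
profiles = {
    "geneA": [1.0, 2.0, 3.0, 4.0],
    "geneB": [2.0, 4.0, 6.0, 8.5],
    "geneC": [4.0, 3.0, 2.0, 1.0],
    "geneD": [8.0, 6.0, 4.1, 2.0],
}
clusters = cluster_by_correlation(profiles)
print(clusters)
```

Each resulting cluster is a candidate set of co-expressed genes that could be carried forward as network nodes.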

Such results can be useful for pattern classification – for example, to classify subtypes of cancer, or to predict differential responses to a drug (pharmacogenomics).

Signal transduction networks use proteins for the nodes and directed edges to represent interactions in which the biochemical conformation of the child is modified by the action of the parent (e.g. mediated by phosphorylation, ubiquitylation, methylation, etc.).

PINs can be discovered with a variety of methods, including two-hybrid screening and in vitro techniques such as co-immunoprecipitation[15] and blue native gel electrophoresis,[16] among others.

It also allows us to quantify associations between individuals, which makes it possible to infer details about the network as a whole at the species and/or population level.

These interactions can be understood by analyzing commonalities among different loci, where a locus is a fixed position on a chromosome at which a particular gene or genetic marker is located.

Data can be sourced in multiple ways, including manual curation of scientific literature into databases, high-throughput datasets, computational predictions, and text mining of older scholarly articles from before the digital era.

We can do this by using contextual biological information, by counting the number of times an interaction is reported in the literature, or by grouping different strategies into a single score.
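To make the idea of a single combined score concrete, the sketch below turns a literature report count into a score and then merges several evidence scores with a noisy-OR rule. Both formulas, the `per_report` weight, and the example values are assumptions made for illustration. They are not a scheme taken from any particular database.

```python
def literature_score(n_reports, per_report=0.3):
    """Score in [0, 1) that grows with the number of independent reports.

    Assumes (hypothetically) that each report independently supports the
    interaction with probability `per_report`.
    """
    return 1.0 - (1.0 - per_report) ** n_reports

def combined_score(scores):
    """Noisy-OR combination: 1 minus the product of (1 - s) over all scores."""
    remaining = 1.0
    for s in scores:
        remaining *= 1.0 - s
    return 1.0 - remaining

# Hypothetical evidence for one interaction: three literature reports,
# plus made-up scores for contextual evidence and a computational prediction.
evidence = [literature_score(3), 0.4, 0.2]
print(round(combined_score(evidence), 3))
```

With this rule, adding any positive evidence never lowers the combined score, and an interaction with no evidence scores zero.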

By measuring the attributes in the previous section we can utilize many different techniques to create accurate inferences based on biological data.

The term encompasses an entire class of techniques such as network motif search, centrality analysis, topological clustering, and shortest paths.
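Two of these techniques, centrality analysis and shortest paths, can be sketched on a small undirected network. The example uses degree centrality, which is only one of several centrality measures, and a breadth-first search for unweighted shortest paths. The node names and edges are hypothetical.

```python
from collections import deque

def degree_centrality(adj):
    """Fraction of the other nodes that each node is directly connected to."""
    n = len(adj)
    return {v: len(nbrs) / (n - 1) for v, nbrs in adj.items()}

def shortest_path(adj, source, target):
    """Unweighted shortest path via breadth-first search, or None."""
    prev = {source: None}
    queue = deque([source])
    while queue:
        v = queue.popleft()
        if v == target:
            path = []
            while v is not None:  # walk predecessors back to the source
                path.append(v)
                v = prev[v]
            return path[::-1]
        for w in adj[v]:
            if w not in prev:
                prev[w] = v
                queue.append(w)
    return None

# Hypothetical undirected interaction network as adjacency lists.
adj = {
    "A": ["B"],
    "B": ["A", "C", "E"],
    "C": ["B", "D"],
    "D": ["C"],
    "E": ["B"],
}
centrality = degree_centrality(adj)
path = shortest_path(adj, "A", "D")
print(centrality)
print(path)
```

Here node `B` has the highest degree centrality, which is the kind of signal used to flag candidate hub nodes.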

By counting all the possible instances, listing all patterns, and testing isomorphisms we can derive crucial information about a network.
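As an illustration of counting instances of one pattern, the sketch below enumerates feed-forward loops (X regulates Y, Y regulates Z, and X also regulates Z) in a small directed network by brute force over ordered node triples. The edge list is hypothetical. The approach counts non-induced matches, meaning that extra edges among the three nodes are ignored, and it is only practical for small networks.

```python
from itertools import permutations

def find_feed_forward_loops(edges):
    """Return all (x, y, z) with edges x->y, y->z and x->z."""
    edge_set = set(edges)
    nodes = sorted({n for edge in edge_set for n in edge})
    hits = []
    for x, y, z in permutations(nodes, 3):
        if (x, y) in edge_set and (y, z) in edge_set and (x, z) in edge_set:
            hits.append((x, y, z))
    return hits

# Hypothetical directed regulatory edges (regulator, target).
edges = [("X", "Y"), ("Y", "Z"), ("X", "Z"), ("Z", "W")]
loops = find_feed_forward_loops(edges)
print(len(loops), loops)
```

Dedicated motif detection tools avoid this cubic enumeration and additionally compare the counts against randomized networks, which this sketch does not attempt.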

Computational research has focused on improving existing motif detection tools to assist biological investigations and to allow larger networks to be analyzed.

[25] Therefore, the topological descriptors should be defined as random variables, with associated probability distributions encoding the uncertainty in their values.
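One way to picture a topological descriptor as a random variable is to attach an existence probability to each edge and estimate the resulting distribution of a node's degree by sampling. The sketch below does this by Monte Carlo. The edge probabilities are made up, and treating edges as independent is a simplifying assumption of this example rather than something the text prescribes.

```python
import random
from collections import Counter

def degree_distribution(edge_probs, node, n_samples=10000, seed=0):
    """Empirical distribution of `node`'s degree over sampled networks.

    `edge_probs` maps (u, v) pairs to the probability that the edge exists.
    Edges are sampled independently (an assumption of this sketch).
    """
    rng = random.Random(seed)
    counts = Counter()
    for _ in range(n_samples):
        degree = sum(
            1
            for (u, v), p in edge_probs.items()
            if node in (u, v) and rng.random() < p
        )
        counts[degree] += 1
    return {k: c / n_samples for k, c in sorted(counts.items())}

# Hypothetical inferred edges with confidence values used as probabilities.
edge_probs = {("A", "B"): 1.0, ("A", "C"): 0.0, ("B", "C"): 0.5}
dist_a = degree_distribution(edge_probs, "A")
dist_b = degree_distribution(edge_probs, "B")
print(dist_a)
print(dist_b)
```

The degree of `B` is then reported as a distribution over possible values rather than a single number, which reflects the uncertainty in the inferred edge between `B` and `C`.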

[26] This technique has been used for progression analysis of disease,[27][28] viral evolution,[29] propagation of contagions on networks,[30] bacteria classification using molecular spectroscopy,[31] and much more in and outside of biology.

Annotation Enrichment Analysis (AEA) is used to overcome biases in the overlap-based statistical methods used to assess these associations.
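For context, overlap statistics of this kind are commonly computed with a hypergeometric test. It asks how likely it is that a gene set shares at least the observed number of genes with an annotation term by chance. The sketch below shows that baseline calculation only, not AEA itself. The function name and the example numbers are hypothetical.

```python
from math import comb

def hypergeom_pvalue(population, annotated, sample, overlap):
    """P(X >= overlap) when drawing `sample` genes without replacement from
    `population` genes, of which `annotated` carry the annotation term."""
    total = comb(population, sample)
    upper = min(annotated, sample)
    favourable = sum(
        comb(annotated, k) * comb(population - annotated, sample - k)
        for k in range(overlap, upper + 1)
    )
    return favourable / total

# Made-up numbers: 1000 genes in total, 50 annotated with a term,
# a gene set of 20, of which 5 carry the annotation.
p = hypergeom_pvalue(1000, 50, 20, 5)
print(p)
```

A small p-value suggests that the overlap is larger than expected by chance under this simple model, which is the kind of assessment whose biases AEA is meant to address.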