A typical pathway model starts with an extracellular signaling molecule that activates a specific receptor, thus triggering a chain of molecular interactions.[4][5]
In the simplest form, however, a pathway might be represented as a list of member molecules with order and relations unspecified.
The general approach of enrichment analyses is to identify FGSs whose members were altered most frequently or most strongly in the given condition, compared with a gene set sampled by chance.
In other words, enrichment can map canonical prior knowledge structured in the form of FGSs to the condition represented by altered genes.
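As a minimal sketch of the comparison against a gene set sampled by chance, the overlap between the altered genes and one FGS can be compared with the overlaps obtained from randomly drawn gene sets of the same size. The function name `empirical_enrichment`, its parameters, and the add-one correction are illustrative assumptions rather than the procedure of any particular tool; the altered genes are assumed to be a subset of `universe`.

```python
import random


def empirical_enrichment(altered, fgs, universe, n_draws=10000, seed=0):
    """Return (observed overlap, empirical p-value) for one FGS.

    Hypothetical helper: random gene sets of the same size as `altered`
    are drawn from `universe`, and the p-value is the fraction of draws
    whose overlap with the FGS is at least as large as the observed one.
    """
    rng = random.Random(seed)
    altered, fgs = set(altered), set(fgs)
    universe = sorted(set(universe))  # sorted so that sampling is reproducible
    observed = len(altered & fgs)
    hits = 0
    for _ in range(n_draws):
        sample = rng.sample(universe, len(altered))
        if len(fgs.intersection(sample)) >= observed:
            hits += 1
    # Add-one correction keeps the estimate from being exactly zero.
    return observed, (hits + 1) / (n_draws + 1)


if __name__ == "__main__":
    genes = [f"g{i}" for i in range(100)]
    pathway = genes[:10]   # toy FGS
    changed = genes[:5]    # toy list of altered genes
    print(empirical_enrichment(changed, pathway, genes))
```

Random sampling makes no distributional assumption, but its resolution is limited by the number of draws; an analytic test is usually preferred when one is available.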
Pathway content, structure, format, and functionality vary between different database resources such as KEGG,[15] WikiPathways, or Reactome.
Public online tools can provide pre-compiled and ready-to-go menus of pathways and networks from different open sources (e.g. EviNet).
Pathway analysis software is available as desktop programs, web-based applications, or packages written in languages such as R and Python and shared openly through the Bioconductor[19] and GitHub[20] projects.
The basic assumption behind ORA is that a biologically relevant pathway can be identified by an excess of AGS genes in it compared to the number expected by chance.[23]
In addition, specific topological information about the role, position, and interaction directions of the pathway genes is used.
This requires additional input data from a pathway database in a pre-specified format, such as KEGG Markup Language (KGML).
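Returning to the ORA assumption stated above, the excess of AGS genes in a pathway over the number expected by chance is commonly assessed with a hypergeometric (one-sided Fisher) test. The sketch below is one such formulation using only the standard library; the function name `ora_pvalue` and the choice of `universe` as the background gene list are assumptions made for illustration.

```python
from math import comb


def ora_pvalue(ags, fgs, universe):
    """Return (overlap, expected overlap, upper-tail hypergeometric p-value).

    Hypothetical helper: `universe` is the assumed background of all genes
    that could have been detected as altered.
    """
    universe = set(universe)
    ags = set(ags) & universe
    fgs = set(fgs) & universe
    N, K, n = len(universe), len(fgs), len(ags)
    k = len(ags & fgs)
    expected = n * K / N
    # P(X >= k) when n genes are drawn without replacement from N genes,
    # K of which belong to the pathway.
    tail = sum(comb(K, i) * comb(N - K, n - i) for i in range(k, min(K, n) + 1))
    return k, expected, tail / comb(N, n)


if __name__ == "__main__":
    genes = [f"g{i}" for i in range(100)]
    print(ora_pvalue(genes[:5], genes[:10], genes))
```

In practice the p-values obtained for many pathways would also be corrected for multiple testing, which this sketch omits.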
Since enrichment significance is influenced by the highly variable node degrees of individual AGS and FGS genes, it should be determined by a dedicated statistical test, which compares the observed number of network edges to the number expected by chance in the same network context.
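The comparison of observed and expected edge counts can be sketched as follows. This is an illustrative approximation, not the dedicated test of any specific NEA implementation: the expected number of AGS-FGS edges is taken to be the product of the summed node degrees of the two gene sets divided by twice the total number of edges, which is one simple way to account for variable node degrees. The function name `network_enrichment` and the edge-list input format are assumptions.

```python
from collections import Counter


def network_enrichment(edges, ags, fgs):
    """Return (observed, expected) numbers of edges linking AGS and FGS genes.

    Hypothetical helper: `edges` is an iterable of undirected gene pairs;
    duplicate edges and self-loops are ignored.
    """
    ags, fgs = set(ags), set(fgs)
    unique = {frozenset(e) for e in edges if e[0] != e[1]}
    degree = Counter()
    observed = 0
    for edge in unique:
        u, v = tuple(edge)
        degree[u] += 1
        degree[v] += 1
        if (u in ags and v in fgs) or (v in ags and u in fgs):
            observed += 1
    m = len(unique)
    deg_ags = sum(degree[g] for g in ags)
    deg_fgs = sum(degree[g] for g in fgs)
    # Degree-aware expectation (assumed approximation, see lead-in).
    expected = deg_ags * deg_fgs / (2 * m) if m else 0.0
    return observed, expected


if __name__ == "__main__":
    toy_edges = [("a", "x"), ("a", "y"), ("b", "x"),
                 ("c", "d"), ("d", "e"), ("e", "f")]
    print(network_enrichment(toy_edges, {"a", "b"}, {"x", "y"}))
```

A significance estimate would additionally require a statistical test on the difference between the two counts, for example one based on network randomization, which is left out here.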
Some valuable properties of NEA are that:
Beyond open-source tools, such as STRING or Cytoscape, a number of companies sell licensed software products to analyse gene sets.