Recently, systematic attempts have been made to identify those genes that are absolutely required to maintain life, provided that all nutrients are available.
Essential genes of single-celled organisms encode proteins for three basic functions: genetic information processing, cell envelopes, and energy production.
In transposon-mediated mutagenesis, transposons are randomly inserted in as many positions in a genome as possible, aiming to disrupt the function of the targeted genes (see figure below).
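The logic of such a screen can be illustrated with a minimal sketch, assuming that gene coordinates and mapped insertion positions are already available. Genes in which no insertion is recovered are flagged as candidate essential genes. The function name, gene names and coordinates below are hypothetical, and real analyses must additionally account for gene length, insertion density and insertions that do not disrupt function.

```python
from bisect import bisect_left, bisect_right


def candidate_essential_genes(genes, insertion_sites):
    """Return names of genes that contain no transposon insertion.

    genes: dict mapping gene name -> (start, end), inclusive coordinates.
    insertion_sites: iterable of integer genome positions hit by a transposon.
    """
    sites = sorted(insertion_sites)
    untouched = []
    for name, (start, end) in genes.items():
        # Number of insertion sites with start <= site <= end.
        hits = bisect_right(sites, end) - bisect_left(sites, start)
        if hits == 0:
            untouched.append(name)
    return untouched


# Hypothetical toy genome: three genes and five mapped insertion sites.
toy_genes = {"geneA": (100, 400), "geneB": (500, 900), "geneC": (1000, 1300)}
toy_sites = [120, 350, 950, 1100, 1250]
print(candidate_essential_genes(toy_genes, toy_sites))  # ['geneB']
```

A gene without insertions is only a candidate: in a sparse library a short gene may escape insertion by chance, which is why saturating mutagenesis aims for as many insertion positions as possible.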
Notes: (a) mutant collection available; (b) direct essentiality screening method (e.g. via antisense RNA) that does not provide information about nonessential genes.
On the basis of genome-wide experimental studies and systems biology analysis, an essential gene database has been developed by Kong et al. (2019) for predicting essential genes in > 4000 bacterial species.
In Schizosaccharomyces pombe (fission yeast), 4,836 heterozygous deletions covering 98.4% of the 4,914 protein-coding open reading frames have been constructed.
[43] Similar screens are more difficult to carry out in other multicellular organisms, including mammals (as a model for humans), for technical reasons, and their results are less clear.
[48] Note that many genes in humans are not absolutely essential for survival but can cause severe disease when mutated.
In a computational analysis of genetic variation and mutations in 2,472 human orthologs of known essential genes in the mouse, Georgi et al. found strong purifying selection and comparatively reduced levels of sequence variation, indicating that these human genes are essential as well.
[60] Tscharke and Dobson (2015) compiled a comprehensive survey of essential genes in vaccinia virus, assessing the role in replication in cell culture of each of the 223 ORFs of the Western Reserve (WR) strain and the 207 ORFs of the Copenhagen strain.
According to their definition, a gene is considered essential (i.e. has a role in cell culture) if its deletion results in a decrease in virus titre of greater than 10-fold in either a single-step or a multiple-step growth curve.
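This threshold criterion can be written down as a short sketch. The function names and the example titres are hypothetical; only the "greater than 10-fold in either growth curve" rule is taken from the definition above.

```python
def fold_decrease(wild_type_titre, mutant_titre):
    """Return how many times lower the mutant titre is than the wild-type titre."""
    if wild_type_titre <= 0 or mutant_titre < 0:
        raise ValueError("titres must be positive (mutant titre may be zero)")
    if mutant_titre == 0:
        return float("inf")  # no detectable virus: treated as an unbounded decrease
    return wild_type_titre / mutant_titre


def is_essential(single_step, multi_step, threshold=10.0):
    """Apply the rule to two (wild_type_titre, mutant_titre) pairs.

    A gene counts as essential if the decrease is strictly greater than
    `threshold` in either the single-step or the multiple-step growth curve.
    """
    return any(fold_decrease(wt, mut) > threshold for wt, mut in (single_step, multi_step))


# Hypothetical titres in plaque-forming units per millilitre.
print(is_essential((100_000_000, 50_000_000), (100_000_000, 5_000_000)))  # True
print(is_essential((100_000_000, 50_000_000), (100_000_000, 10_000_000)))  # False
```

In the second call the multiple-step decrease is exactly 10-fold, which does not satisfy "greater than 10-fold".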
By this definition, 93 genes are required for vaccinia virus replication in cell culture, while 108 and 94 ORFs, from WR and Copenhagen respectively, are non-essential.
[61] Vaccinia viruses with deletions at either end of the genome behaved as expected, exhibiting only mild or host range defects.
In contrast, combining deletions at both ends of the genome for VACV strain WR caused a devastating growth defect on all cell lines tested.
[1] However, depending on the surrounding environment, certain essential gene mutants may show partial functions, which can be quantitatively determined in some studies.
[1] Using CRISPR interference, the expression of essential genes can be modulated or "tuned", leading to quantitative (or continuous) relationships between the level of gene-expression and the magnitude of fitness cost exhibited by a given mutant.
[4] Streptococcus pneumoniae appears to require 147 genes for growth and survival in saliva,[66] more than the 113–133 that have been found in previous studies.
[69] Another kind of metabolic dependency, unrelated to cross-species interactions, can be found when bacteria are grown under specific nutrient conditions.
Specifically, isocitrate dehydrogenase (icd) and citrate synthase (gltA) are two enzymes that are part of the tricarboxylic acid (TCA) cycle.
Fang et al. found 257 persistent genes, which exist both in B. subtilis (for the Bacillota) and E. coli (for the Gammaproteobacteria).
In diploid organisms, only a single functional copy of some essential genes may be needed (haplosufficiency), with the heterozygote displaying an instructive phenotype.
Screens to identify essential genes in the human chronic myelogenous leukemia cell line K562 with these two methods showed only limited overlap.
[1] Such differences in essential genes among bacteria can be used to develop antibacterial therapies targeted against specific pathogens, with the aim of reducing antibiotic resistance in the microbiome era.
Liu et al. (2015)[93] used the Hurst exponent, a characteristic parameter that describes long-range correlation in DNA, to predict essential genes.
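As an illustration of what a Hurst exponent measures, the following sketch estimates it with classical rescaled-range (R/S) analysis. This is not necessarily the procedure of Liu et al.: the purine/pyrimidine encoding, the window sizes and all function names are assumptions made for the example. Values near 0.5 indicate no long-range correlation, higher values indicate persistence, and lower values indicate anti-persistence.

```python
import math
import random


def dna_to_walk(seq):
    """Encode purines (A, G) as +1 and pyrimidines (C, T) as -1 (an assumed encoding)."""
    return [1 if base in "AG" else -1 for base in seq.upper() if base in "ACGT"]


def rescaled_range(chunk):
    """Return R/S for one chunk, or None when the chunk has zero variance."""
    n = len(chunk)
    mean = sum(chunk) / n
    deviations = [x - mean for x in chunk]
    std = math.sqrt(sum(d * d for d in deviations) / n)
    if std == 0:
        return None
    running, lowest, highest = 0.0, 0.0, 0.0
    for d in deviations:
        running += d  # cumulative sum of mean-adjusted values
        lowest = min(lowest, running)
        highest = max(highest, running)
    return (highest - lowest) / std


def hurst_rs(series, min_window=8):
    """Estimate the Hurst exponent as the slope of log(R/S) against log(window size)."""
    log_n, log_rs = [], []
    window = min_window
    while window <= len(series) // 2:
        values = []
        # Split the series into non-overlapping chunks of the current window size.
        for start in range(0, len(series) - window + 1, window):
            rs = rescaled_range(series[start:start + window])
            if rs is not None:
                values.append(rs)
        if values:
            log_n.append(math.log(window))
            log_rs.append(math.log(sum(values) / len(values)))
        window *= 2
    if len(log_n) < 2:
        raise ValueError("series too short for R/S analysis")
    # Ordinary least-squares slope of log(R/S) on log(n).
    mean_x = sum(log_n) / len(log_n)
    mean_y = sum(log_rs) / len(log_rs)
    num = sum((x - mean_x) * (y - mean_y) for x, y in zip(log_n, log_rs))
    den = sum((x - mean_x) ** 2 for x in log_n)
    return num / den


# Hypothetical input: a seeded pseudo-random sequence, not real genomic data.
rng = random.Random(0)
random_seq = "".join(rng.choice("ACGT") for _ in range(1024))
print(round(hurst_rs(dna_to_walk(random_seq)), 2))
```

R/S estimates are known to be biased upward for short windows, so an uncorrelated sequence typically yields a value somewhat above 0.5 with this simple estimator.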
The problem here is that the smallest genomes belong to parasitic (or symbiotic) species, which can survive with a reduced gene set because they obtain many nutrients from their hosts.
For instance, one of the smallest genomes is that of Hodgkinia cicadicola, a symbiont of cicadas, containing only 144 kb of DNA encoding only 188 genes.
Song et al. presented a novel method to predict essential genes that only uses the Z-curve and other sequence-based features.
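The Z-curve represents a DNA sequence as a three-dimensional curve whose coordinates are built from cumulative base counts. The sketch below computes the standard cumulative form of that curve. It is meant only to show the transform itself: the exact feature set used by Song et al. is not described here, and the function name is an assumption.

```python
def z_curve(seq):
    """Return the cumulative Z-curve as a list of (x, y, z) points, one per base."""
    counts = {"A": 0, "C": 0, "G": 0, "T": 0}
    points = []
    for base in seq.upper():
        if base not in counts:
            continue  # skip ambiguous symbols such as N
        counts[base] += 1
        a, c, g, t = counts["A"], counts["C"], counts["G"], counts["T"]
        points.append((
            (a + g) - (c + t),  # x: purines minus pyrimidines
            (a + c) - (g + t),  # y: amino bases minus keto bases
            (a + t) - (c + g),  # z: weak minus strong hydrogen-bonding bases
        ))
    return points


print(z_curve("ATGC"))  # [(1, 1, 1), (0, 0, 2), (1, -1, 1), (0, 0, 0)]
```

Because the three coordinates together determine the four cumulative base counts at every position, the curve preserves all the information in the original sequence, which makes it a convenient source of purely sequence-based features.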
Guo et al. (2015) have developed three online services to predict essential genes in bacterial genomes.
[98] Kong et al. (2019) have developed the ePath database, which can be used to predict essential genes across > 4000 bacterial species.
[99] Lu et al.[100] presented a similar approach and identified 3,450 domains that are essential in at least one microbial species.