Virtual screening

[2][3] Virtual screening has been defined as "automatically evaluating very large libraries of compounds" using computer programs.

As the accuracy of the method has increased, virtual screening has become an integral part of the drug discovery process.

[16] The structure-based virtual screening approach includes different computational techniques that consider the structure of the receptor that is the molecular target of the investigated active ligands.

[22][23] Hybrid methods that rely on both structural and ligand similarity were also developed to overcome the limitations of traditional virtual ligand screening (VLS) approaches.

[26][27] The predictions from this method have been experimentally assessed and show good enrichment in identifying active small molecules.
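Enrichment is commonly summarized with an enrichment factor: the hit rate among the top-ranked compounds divided by the hit rate in the whole library. The following is a minimal sketch of that calculation, not the evaluation used in the cited studies. The function name, the `fraction` parameter, and the toy ranking are assumptions made for illustration.

```python
def enrichment_factor(ranked_labels, fraction):
    """Enrichment factor for the top `fraction` of a ranked library.

    ranked_labels: list of booleans ordered from best to worst score,
                   True meaning the compound is a known active.
    """
    n_total = len(ranked_labels)
    n_actives = sum(ranked_labels)
    if n_total == 0 or n_actives == 0:
        raise ValueError("need a non-empty library with at least one active")
    # Number of compounds in the selected top slice (at least one).
    n_top = max(1, round(n_total * fraction))
    hits = sum(ranked_labels[:n_top])
    # Hit rate in the selection relative to the hit rate of the whole library.
    return (hits / n_top) / (n_actives / n_total)


# Toy ranking (hypothetical data): 3 actives in a library of 10 compounds.
ranking = [True, True, False, True, False, False, False, False, False, False]
ef_top20 = enrichment_factor(ranking, 0.2)
```

A value above 1 means the ranking places actives near the top more often than random selection would.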

The above-specified method depends on global structural similarity and cannot select a particular ligand-binding site in the protein of interest a priori.

The computation of pairwise interactions between atoms, which is a prerequisite for the operation of many virtual screening programs, scales as O(N^2), where N is the number of atoms in the system.
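The quadratic growth of pairwise computations can be illustrated with a short sketch: N atoms form N(N-1)/2 distinct pairs, so doubling the number of atoms roughly quadruples the work. The function names and the toy coordinates below are assumptions for illustration, and a real screening program would use far more elaborate interaction terms than a plain distance.

```python
import math
from itertools import combinations


def count_pairs(n_atoms):
    """Number of distinct atom pairs among n_atoms atoms: N(N-1)/2."""
    return n_atoms * (n_atoms - 1) // 2


def pairwise_distances(coords):
    """Euclidean distance for every distinct pair of 3D coordinates."""
    return [math.dist(a, b) for a, b in combinations(coords, 2)]


# Four hypothetical atoms give 4 * 3 / 2 = 6 pairs.
atoms = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0), (0.0, 1.0, 0.0), (0.0, 0.0, 1.0)]
distances = pairwise_distances(atoms)
```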

The size of the task requires a parallel computing infrastructure, such as a cluster of Linux systems, running a batch queue processor to handle the work, such as Sun Grid Engine or Torque PBS.

Furthermore, it may not be efficient to run one comparison per job, because the ramp-up time of the cluster nodes could easily exceed the time spent on useful work.

To work around this, it is necessary to process batches of compounds in each cluster job, aggregating the results into some kind of log file.
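The batching idea can be sketched as follows: the compound library is cut into fixed-size batches, each batch is handled by one job, and every job returns log lines that can be concatenated afterwards. This is only an illustration under stated assumptions. `score_compound` is a hypothetical placeholder for a real docking or similarity calculation, and the submission of jobs to a queue system is left out entirely.

```python
from itertools import islice


def batched(compounds, batch_size):
    """Yield successive lists of at most batch_size compounds."""
    iterator = iter(compounds)
    while True:
        batch = list(islice(iterator, batch_size))
        if not batch:
            return
        yield batch


def score_compound(smiles):
    """Hypothetical placeholder score; a real job would dock or compare here."""
    return float(len(smiles))


def run_job(job_id, batch):
    """Process one batch and return tab-separated log lines."""
    return [f"{job_id}\t{smiles}\t{score_compound(smiles):.1f}" for smiles in batch]


library = ["CCO", "CCN", "c1ccccc1", "CC(=O)O", "CCCC"]
log_lines = []
for job_id, batch in enumerate(batched(library, 2)):
    log_lines.extend(run_job(job_id, batch))
```

Choosing the batch size is a trade-off: larger batches amortize the start-up cost of a job, while smaller batches spread the work more evenly across nodes.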

The aim of virtual screening is to identify molecules of novel chemical structure that bind to the macromolecular target of interest.

Thus, success of a virtual screen is defined in terms of finding interesting new scaffolds rather than the total number of hits.

[29][30] By contrast, in prospective applications of virtual screening, the resulting hits are subjected to experimental confirmation (e.g., IC50 measurements).

[37] It is preferable to use multiple rigid molecules, and the ligands should be diverse; in other words, they should have different features that do not occur during the binding phase.

[1] Shape-based molecular similarity approaches have been established as important and popular virtual screening techniques.

Supervised learning techniques use training and test datasets composed of known active and known inactive compounds.

Different ML algorithms have been applied with success in virtual screening strategies, such as recursive partitioning, support vector machines, random forest, k-nearest neighbors and neural networks.
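As an illustration of one of the listed algorithms, the sketch below classifies a query compound by k-nearest neighbors, using the Tanimoto coefficient on fingerprints as the similarity measure. It is a toy example under stated assumptions: fingerprints are represented as sets of hypothetical feature identifiers, the training data are invented, and a real workflow would compute fingerprints with a cheminformatics toolkit and evaluate on a held-out test set.

```python
def tanimoto(a, b):
    """Tanimoto coefficient of two fingerprints given as sets of feature ids."""
    union = len(a | b)
    return len(a & b) / union if union else 0.0


def knn_predict(query, training, k=3):
    """Predict active (True) or inactive (False) by majority vote of the
    k training compounds most similar to the query."""
    ranked = sorted(training, key=lambda item: tanimoto(query, item[0]), reverse=True)
    top = ranked[:k]
    votes_active = sum(1 for _, is_active in top if is_active)
    return votes_active * 2 > len(top)


# Hypothetical training set: (fingerprint, known activity label).
training_set = [
    (frozenset({1, 2, 3, 4}), True),
    (frozenset({1, 2, 3, 5}), True),
    (frozenset({2, 3, 4, 6}), True),
    (frozenset({7, 8, 9}), False),
    (frozenset({7, 8, 10}), False),
    (frozenset({8, 9, 11}), False),
]

predicted_active = knn_predict(frozenset({1, 2, 3}), training_set)
predicted_inactive = knn_predict(frozenset({7, 8}), training_set)
```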

Recursive partitioning works by finding rules that split the classes with a low misclassification error, repeating each step until no sensible splits can be found.
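The repeated splitting described above can be sketched as a small decision-tree builder. This is a minimal illustration, not the procedure of any particular package: it uses Gini impurity as the split criterion (one common choice among several), and the two descriptors in the toy data (a molecular weight and a logP value) are hypothetical.

```python
def gini(labels):
    """Gini impurity of a list of boolean class labels."""
    if not labels:
        return 0.0
    p = sum(labels) / len(labels)
    return 2 * p * (1 - p)


def best_split(rows):
    """Return (feature_index, threshold, weighted_impurity) of the best
    split, or None when no feature has two distinct values."""
    best = None
    n = len(rows)
    for j in range(len(rows[0][0])):
        values = sorted({x[j] for x, _ in rows})
        for low, high in zip(values, values[1:]):
            t = (low + high) / 2  # candidate threshold between neighbors
            left = [y for x, y in rows if x[j] <= t]
            right = [y for x, y in rows if x[j] > t]
            score = (len(left) * gini(left) + len(right) * gini(right)) / n
            if best is None or score < best[2]:
                best = (j, t, score)
    return best


def build_tree(rows, min_gain=1e-9):
    """Split recursively until no split reduces the impurity any further."""
    labels = [y for _, y in rows]
    majority = sum(labels) * 2 >= len(labels)
    split = best_split(rows)
    if split is None or gini(labels) - split[2] <= min_gain:
        return majority  # leaf: predict the majority class
    j, t, _ = split
    left = [r for r in rows if r[0][j] <= t]
    right = [r for r in rows if r[0][j] > t]
    return (j, t, build_tree(left, min_gain), build_tree(right, min_gain))


def predict(tree, x):
    """Follow the split rules down to a leaf and return its class."""
    while isinstance(tree, tuple):
        j, t, left, right = tree
        tree = left if x[j] <= t else right
    return tree


# Hypothetical data: ((molecular weight, logP), is_active).
data = [
    ((200.0, 1.0), False),
    ((250.0, 2.0), False),
    ((320.0, 3.5), True),
    ((400.0, 4.0), True),
]
tree = build_tree(data)
```

On this toy data a single rule on the first descriptor separates the two classes, so the recursion stops after one split.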

Figure 1. Flow chart of virtual screening [1]