SAMPL Challenge

SAMPL (Statistical Assessment of the Modeling of Proteins and Ligands) is a set of community-wide blind challenges that aim to advance computational techniques as standard predictive tools in rational drug design.

The SAMPL5 challenge contained two prediction categories: the binding affinities of host–guest systems, and the distribution coefficients of drug-like molecules between water and cyclohexane.
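For reference, the distribution coefficient predicted in the second category can be written as below. This is the standard textbook definition rather than a formula taken from the SAMPL5 materials, and the symbols ($X_i$ for each form of the solute, neutral or ionized) are chosen here for illustration.

```latex
\log D_{\text{cyclohexane/water}}
  = \log_{10}\!\left(
      \frac{\sum_i [X_i]_{\text{cyclohexane}}}{\sum_i [X_i]_{\text{water}}}
    \right)
```

Because the sums run over every protonation state of the solute, a predicted log D depends on which species are assumed to be present in each phase.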

[11] The SAMPL challenge seeks to accelerate progress in developing quantitative, accurate drug discovery tools by providing prospective validation and rigorous comparisons for computational methodologies and force fields.

To provide this prospective validation, SAMPL challenges are organized as blind tests: for each challenge, new datasets are carefully designed and collected from academic or industrial research laboratories, and the measurements are released shortly after the prediction submission deadline.

[12] The past several SAMPL host–guest, hydration free energy, and log D challenges revealed the limitations of generalized force fields,[13][14] facilitated the development of solvent models,[15][16] and highlighted the importance of properly handling protonation states and salt effects.

[9][10] The effort is spearheaded by David L. Mobley (UC Irvine) with co-investigators John D. Chodera (MSKCC), Bruce C. Gibb (Tulane), and Lyle Isaacs (Maryland).

Currently, challenges and workshops are run in partnership with the NIH-funded Drug Design Data Resource (D3R), but this will likely change over time because funding for the two projects is not coupled.

Funding also allowed a broadening of SAMPL's scope; through SAMPL6, its role had been seen as primarily focused on physical properties, with D3R handling protein–ligand challenges.

[23] SAMPL5 allowed participants to predict the binding affinities of three sets of host–guest systems: an acyclic CB7 derivative and two hosts from the octa-acid family.

A wide array of computational methods were tested, including density functional theory (DFT), molecular dynamics, docking, and metadynamics.

Distribution coefficient predictions were introduced for the first time, receiving a total of 76 submissions from 18 research groups or individual scientists for a set of 53 small molecules.

The top-performing methods in the host–guest challenge yielded encouraging yet imperfect correlations with experimental data, accompanied by large, systematic shifts relative to experiment.
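The distinction between correlation and systematic shift can be illustrated with a minimal Python sketch. The numbers below are hypothetical values invented for illustration, not SAMPL data, and the function name `evaluate` and the choice of metrics are assumptions rather than the challenge's official analysis.

```python
import math


def evaluate(predicted, experimental):
    """Compare predictions with experiment (same units, e.g. kcal/mol)."""
    n = len(predicted)
    errors = [p - e for p, e in zip(predicted, experimental)]

    # Mean signed error: captures a systematic shift relative to experiment.
    mean_signed_error = sum(errors) / n
    # Root-mean-square error: overall magnitude of the deviations.
    rmse = math.sqrt(sum(err * err for err in errors) / n)

    # Pearson correlation: insensitive to a constant offset.
    mean_p = sum(predicted) / n
    mean_e = sum(experimental) / n
    cov = sum((p - mean_p) * (e - mean_e)
              for p, e in zip(predicted, experimental))
    var_p = sum((p - mean_p) ** 2 for p in predicted)
    var_e = sum((e - mean_e) ** 2 for e in experimental)
    pearson_r = cov / math.sqrt(var_p * var_e)

    return {
        "pearson_r": pearson_r,
        "mean_signed_error": mean_signed_error,
        "rmse": rmse,
    }


# Hypothetical binding free energies (kcal/mol), for illustration only:
# every prediction is shifted by exactly -3.0 relative to "experiment".
experimental = [-5.0, -6.0, -7.0, -8.0]
predicted = [-8.0, -9.0, -10.0, -11.0]

metrics = evaluate(predicted, experimental)
print(metrics)
```

In this constructed case the predictions rank the systems perfectly, yet every value is offset from experiment by the same amount, which is why a correlation coefficient alone does not show whether absolute affinities are accurate.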

SAMPL8 included host–guest components on the binding of drugs of abuse to CB8 and of a series of small molecules to Gibb Deep Cavity Cavitands (GDCCs), as detailed on the SAMPL8 GitHub repository.

[9] Some data are planned to be collected directly by the SAMPL co-investigators (Chodera, Gibb, and Isaacs), but industry partnerships and internships have also been proposed.