Macromolecular structure validation is the process of evaluating the reliability of three-dimensional atomic models of large biological molecules such as proteins and nucleic acids.
These models, which provide 3D coordinates for each atom in the molecule (see example in the image), come from structural biology experiments such as X-ray crystallography[1] or nuclear magnetic resonance (NMR).
Proteins and nucleic acids are the workhorses of biology, providing the necessary chemical reactions, structural organization, growth, mobility, reproduction, and environmental sensitivity.
It included Rfree cross-validation for model-to-data match,[6] bond length and angle parameters for covalent geometry,[7] and sidechain and backbone conformational criteria.
[20] Though mRNA structures are generally short-lived and single-stranded, there is an abundance of non-coding RNAs with different secondary and tertiary folds (tRNA, rRNA, etc.)
Since RNA helices are short (on average 10-20 bp), the use of electrostatic surface potential as a validation parameter[23] has been found to be beneficial, particularly for modelling purposes.
For globular proteins, interior atomic packing of side chains (arising from short-range, local interactions)[24][25][26][27] has been shown to be pivotal in the structural stabilization of the protein fold.
While the clash score of MolProbity identifies steric clashes at very high resolution, the Complementarity Plot combines packing anomalies with electrostatic imbalance of side chains and flags either or both.
[33] Privateer also generates scalable two-dimensional SVG diagrams according to the Essentials of Glycobiology[34] standard symbol nomenclature containing all the validation information as tooltip annotations (see figure).
Many evaluation criteria apply globally to an entire experimental structure, most notably the resolution, the anisotropy or incompleteness of the data, and the residual or R-factor that measures overall model-to-data match (see below).
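The residual mentioned above can be illustrated with a minimal sketch. It assumes the conventional definition R = Σ| |Fobs| − |Fcalc| | / Σ|Fobs| and assumes the calculated amplitudes have already been placed on the same scale as the observed ones (real refinement programs also fit a scale factor, which is omitted here). The function names, the `free_flags` argument, and the sample numbers are hypothetical and chosen only for illustration. Rfree is the same quantity computed over reflections held out of refinement.

```python
def r_factor(f_obs, f_calc):
    """Return sum(| |Fobs| - |Fcalc| |) / sum(|Fobs|) over paired amplitudes.

    Assumes f_calc is already on the same scale as f_obs.
    """
    if len(f_obs) != len(f_calc) or not f_obs:
        raise ValueError("need two non-empty lists of equal length")
    numerator = sum(abs(abs(o) - abs(c)) for o, c in zip(f_obs, f_calc))
    denominator = sum(abs(o) for o in f_obs)
    return numerator / denominator


def r_work_and_r_free(f_obs, f_calc, free_flags):
    """Split reflections by a boolean flag and return (R_work, R_free).

    free_flags[i] is True for reflections excluded from refinement.
    """
    work_obs, work_calc, free_obs, free_calc = [], [], [], []
    for o, c, is_free in zip(f_obs, f_calc, free_flags):
        if is_free:
            free_obs.append(o)
            free_calc.append(c)
        else:
            work_obs.append(o)
            work_calc.append(c)
    return r_factor(work_obs, work_calc), r_factor(free_obs, free_calc)


# Illustrative amplitudes only, not real diffraction data.
f_obs = [10.0, 20.0, 30.0, 40.0, 50.0]
f_calc = [9.0, 22.0, 27.0, 40.0, 45.0]
free_flags = [False, False, False, False, True]
r_work, r_free = r_work_and_r_free(f_obs, f_calc, free_flags)
```

Because the held-out reflections never influence the model, an Rfree much larger than the working R suggests that the model has been overfitted to the data.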
Cryo-EM presents special challenges to model-builders, as the observed electron density is frequently insufficient to resolve individual atoms, leading to a higher likelihood of errors.
There is great interest in the development of reliable validation standards for SAXS data interpretation and for quality of the resulting models, but there are as yet no established methods in general use.
[58] The major criterion for CASP evaluation is a weighted score called GDT-TS for the match of Cα positions between the predicted and the experimental models.
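As a rough illustration of how such a score behaves, the sketch below averages the fraction of alpha-carbon positions that fall within 1, 2, 4, and 8 Å of their experimental counterparts. It is a simplification under stated assumptions: the two coordinate lists are assumed to be already superimposed and matched residue by residue, whereas the full GDT procedure searches over many superpositions to maximize the count at each cutoff. The function name `gdt_ts` and the sample coordinates are hypothetical.

```python
import math


def gdt_ts(predicted, experimental, cutoffs=(1.0, 2.0, 4.0, 8.0)):
    """Approximate GDT-TS (0 to 100) for two pre-superimposed coordinate lists.

    Each list holds (x, y, z) tuples in angstroms, one per residue, in the
    same residue order. No superposition search is performed here.
    """
    if len(predicted) != len(experimental) or not predicted:
        raise ValueError("need two non-empty coordinate lists of equal length")
    distances = [math.dist(p, q) for p, q in zip(predicted, experimental)]
    # Fraction of residues within each distance cutoff.
    fractions = [
        sum(d <= cutoff for d in distances) / len(distances)
        for cutoff in cutoffs
    ]
    return 100.0 * sum(fractions) / len(cutoffs)


# Illustrative coordinates only: four residues displaced by 0.5, 1.5, 3 and 10 A.
experimental = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0), (2.0, 0.0, 0.0), (3.0, 0.0, 0.0)]
predicted = [(0.0, 0.5, 0.0), (1.0, 1.5, 0.0), (2.0, 3.0, 0.0), (3.0, 10.0, 0.0)]
score = gdt_ts(predicted, experimental)
```

Using several cutoffs rather than a single RMSD keeps one badly placed region from dominating the score, so a prediction that is correct over most of the chain still scores well.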