Structure validation

Macromolecular structure validation is the process of evaluating the reliability of three-dimensional atomic models of large biological molecules such as proteins and nucleic acids.

These models, which provide 3D coordinates for each atom in the molecule (see example in the image), come from structural biology experiments such as X-ray crystallography[1] or nuclear magnetic resonance (NMR).

Proteins and nucleic acids are the workhorses of biology, providing the necessary chemical reactions, structural organization, growth, mobility, reproduction, and environmental sensitivity.

It included Rfree cross-validation for model-to-data match,[6] bond length and angle parameters for covalent geometry,[7] and sidechain and backbone conformational criteria.

[20] Though mRNA structures are generally short-lived and single-stranded, there is an abundance of non-coding RNAs with different secondary and tertiary folds (tRNA, rRNA, etc.).

Since RNA helices are short (10-20 bp on average), the use of electrostatic surface potential as a validation parameter[23] has been found to be beneficial, particularly for modelling purposes.

For globular proteins, interior atomic packing of side chains (arising from short-range, local interactions)[24][25][26][27] has been shown to be pivotal in the structural stabilization of the protein fold.

While the clashscore of MolProbity identifies steric clashes at a very high resolution, the Complementarity Plot combines packing anomalies with electrostatic imbalance of side chains and flags either or both.
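As a rough illustration of how a clash count can be turned into a score, the sketch below counts non-bonded atom pairs whose van der Waals spheres overlap by at least 0.4 Å and reports the count per 1000 atoms. It is a simplification under stated assumptions, not MolProbity's implementation: the function name `clashscore`, the `bonded` exclusion set, and the uniform 1.7 Å radii in the example are hypothetical, and a real all-atom contact analysis also handles explicit hydrogens, hydrogen bonds, and atoms separated by only a few covalent bonds.

```python
import math
from itertools import combinations


def clashscore(coords, radii, bonded=frozenset(), min_overlap=0.4):
    """Count serious steric overlaps per 1000 atoms (simplified sketch).

    coords      -- list of (x, y, z) tuples in angstroms
    radii       -- van der Waals radius for each atom, same order as coords
    bonded      -- set of (i, j) index pairs with i < j that are excluded
                   because the atoms are covalently connected (assumption:
                   the caller supplies this list)
    min_overlap -- overlap in angstroms at which a contact counts as a clash
    """
    if len(coords) != len(radii) or not coords:
        raise ValueError("coords and radii must be non-empty and equal in length")

    clashes = 0
    # combinations() yields each pair once, always with i < j.
    for i, j in combinations(range(len(coords)), 2):
        if (i, j) in bonded:
            continue
        overlap = radii[i] + radii[j] - math.dist(coords[i], coords[j])
        if overlap >= min_overlap:
            clashes += 1
    return 1000.0 * clashes / len(coords)


# Toy example: atoms 0 and 1 are bonded; atom 2 sits too close to atom 0.
example_coords = [(0.0, 0.0, 0.0), (1.5, 0.0, 0.0), (0.0, 0.0, 2.8), (10.0, 10.0, 10.0)]
example_radii = [1.7, 1.7, 1.7, 1.7]
example_score = clashscore(example_coords, example_radii, bonded={(0, 1)})
```

The all-pairs loop is quadratic in the number of atoms, which is acceptable for a toy example; a practical tool would use a spatial grid or neighbour search.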

[33] Privateer also generates scalable two-dimensional SVG diagrams according to the Essentials of Glycobiology[34] standard symbol nomenclature containing all the validation information as tooltip annotations (see figure).

Many evaluation criteria apply globally to an entire experimental structure, most notably the resolution, the anisotropy or incompleteness of the data, and the residual or R-factor that measures overall model-to-data match (see below).
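The residual can be made concrete with a short sketch. It uses the conventional definition R = Σ| |Fobs| − k|Fcalc| | / Σ|Fobs|, summed over reflections, and evaluates it separately for the reflections used in refinement (the working set) and for those held out for Rfree cross-validation. The function names, the `free_flags` list, and the amplitudes are illustrative assumptions, and the scale factor k is simply passed in rather than fitted as a refinement program would do.

```python
def r_factor(f_obs, f_calc, scale=1.0):
    """Return R = sum(| |Fobs| - k*|Fcalc| |) / sum(|Fobs|)."""
    if len(f_obs) != len(f_calc):
        raise ValueError("f_obs and f_calc must have the same length")
    denominator = sum(abs(fo) for fo in f_obs)
    if denominator == 0:
        raise ValueError("no observed amplitude to compare against")
    numerator = sum(abs(abs(fo) - scale * abs(fc)) for fo, fc in zip(f_obs, f_calc))
    return numerator / denominator


def r_work_and_r_free(f_obs, f_calc, free_flags, scale=1.0):
    """Split reflections by free_flags and return (R_work, R_free).

    free_flags[i] is True when reflection i was withheld from refinement.
    """
    if not (len(f_obs) == len(f_calc) == len(free_flags)):
        raise ValueError("all inputs must have the same length")
    work_obs, work_calc, free_obs, free_calc = [], [], [], []
    for fo, fc, is_free in zip(f_obs, f_calc, free_flags):
        if is_free:
            free_obs.append(fo)
            free_calc.append(fc)
        else:
            work_obs.append(fo)
            work_calc.append(fc)
    return (r_factor(work_obs, work_calc, scale),
            r_factor(free_obs, free_calc, scale))


# Made-up structure-factor amplitudes; two of the five reflections are "free".
f_obs = [100.0, 80.0, 60.0, 40.0, 20.0]
f_calc = [95.0, 82.0, 50.0, 38.0, 26.0]
free_flags = [False, False, True, False, True]
r_work, r_free = r_work_and_r_free(f_obs, f_calc, free_flags)
```

In this toy data the held-out reflections fit much worse than the working set, which is the pattern that would suggest a model fitted too closely to the data used in refinement.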

Cryo-EM presents special challenges to model builders, as the observed electron density is frequently insufficient to resolve individual atoms, leading to a higher likelihood of errors.

There is great interest in the development of reliable validation standards for SAXS data interpretation and for quality of the resulting models, but there are as yet no established methods in general use.

[58] The major criterion for CASP evaluation is a weighted score called GDT-TS for the match of Cα positions between the predicted and the experimental models.
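A minimal sketch of the idea behind GDT-TS follows. It uses the commonly quoted form of the score, the mean over distance cutoffs of 1, 2, 4, and 8 Å of the percentage of Cα atoms lying within that cutoff of their experimental positions. The sketch assumes the two coordinate lists are already superimposed and in one-to-one residue correspondence; the full procedure searches over many superpositions to maximize the count at each cutoff, so a single fixed superposition can only give a lower estimate. The function name `gdt_ts` and the coordinates are hypothetical.

```python
import math


def gdt_ts(model_ca, reference_ca, cutoffs=(1.0, 2.0, 4.0, 8.0)):
    """Return a GDT-TS-style score (0-100) for pre-superimposed C-alpha atoms.

    model_ca, reference_ca -- lists of (x, y, z) tuples in angstroms, with
                              residue i of the model matched to residue i of
                              the reference
    """
    if len(model_ca) != len(reference_ca) or not model_ca:
        raise ValueError("coordinate lists must be non-empty and equal in length")

    distances = [math.dist(m, r) for m, r in zip(model_ca, reference_ca)]
    n = len(distances)
    # Fraction of residues within each cutoff, then averaged over cutoffs.
    fractions = [sum(d <= cutoff for d in distances) / n for cutoff in cutoffs]
    return 100.0 * sum(fractions) / len(cutoffs)


# Toy example: four residues displaced by 0.5, 1.5, 3.0 and 10.0 angstroms.
reference = [(0.0, 0.0, 0.0), (3.8, 0.0, 0.0), (7.6, 0.0, 0.0), (11.4, 0.0, 0.0)]
model = [(0.0, 0.5, 0.0), (3.8, 1.5, 0.0), (7.6, 3.0, 0.0), (11.4, 10.0, 0.0)]
score = gdt_ts(model, reference)  # fractions 1/4, 2/4, 3/4, 3/4 -> 56.25
```

Because the score counts residues within several thresholds rather than averaging distances, one badly placed residue (the fourth one here) lowers it only modestly, unlike a root-mean-square deviation.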

Structure validation concept: model of a protein (each ball is an atom), and magnified region with electron density data and 3 bright flags for problems
A 2D diagram of an N-glycan linked to an antibody fragment in the structure with PDB accession code 4BYH. This diagram, which has been generated with Privateer,[33] follows the standard symbol nomenclature[34] and includes, in its original SVG format, annotations containing validation information, including ring conformation and detected monosaccharide types.
What can be seen in low vs high resolution macromolecular crystal structures
NMR structural ensemble for PDB file 2K5D, with well-defined structure for the beta strands (arrows) and undefined, presumably highly mobile regions for the orange loop and the blue N-terminus