This requires new ways of communicating this structural information to the broader research community.
These proteins are then purified and crystallized, and subjected to one of two types of structure determination: X-ray crystallography or nuclear magnetic resonance (NMR).
The whole genome sequence allows for the design of every primer required to amplify all of the ORFs, clone them into bacteria, and then express them.
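As a rough illustration of how primers can be derived from known ORF sequences, the sketch below takes the first bases of an ORF as the forward primer and the reverse complement of its last bases as the reverse primer. The function names, the fixed primer length, and the toy sequence are assumptions made for this example. Real primer design also considers melting temperature, GC content, and cloning overhangs, which are omitted here.

```python
def reverse_complement(seq: str) -> str:
    """Return the reverse complement of a DNA sequence (A/C/G/T only)."""
    complement = {"A": "T", "C": "G", "G": "C", "T": "A"}
    return "".join(complement[base] for base in reversed(seq.upper()))


def design_primers(orf: str, primer_len: int = 20) -> tuple[str, str]:
    """Return (forward, reverse) primers for one ORF, both written 5'->3'.

    Simplifying assumption: primers have a fixed length and are taken
    directly from the two ends of the ORF.
    """
    orf = orf.upper()
    if len(orf) < primer_len:
        raise ValueError("ORF is shorter than the requested primer length")
    forward = orf[:primer_len]                       # matches the 5' end of the coding strand
    reverse = reverse_complement(orf[-primer_len:])  # anneals to the 3' end of the coding strand
    return forward, reverse


# Hypothetical toy ORF, not a real gene.
orfs = {"orf_0001": "ATGGCTAGCAAAGGTGAAGAACTGTTTACCGGTTAA"}

# With a whole genome sequence, the same step can be repeated for every ORF.
primers = {name: design_primers(seq, primer_len=12) for name, seq in orfs.items()}
```

Because the genome sequence supplies every ORF up front, the dictionary comprehension at the end stands in for the genome-wide step: one primer pair per ORF, generated without any per-gene manual work.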
One highly successful method for ab initio modeling is the Rosetta program, which divides the protein into short segments and arranges each short polypeptide chain into a low-energy local conformation.
Highly accurate modeling is considered to require at least 50% amino acid sequence identity between the unknown protein and the solved structure.
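The 50% figure can be checked with a simple percent-identity calculation over a pairwise alignment. The sketch below assumes the two sequences are already aligned to equal length with `-` marking gaps, and it counts identity only over columns where neither sequence has a gap. Other denominators are also in use, so this is one convention among several. The function names and the example sequences are hypothetical.

```python
def percent_identity(aligned_a: str, aligned_b: str) -> float:
    """Percent of identical residues over gap-free columns of a pairwise alignment."""
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have the same length")
    compared = 0
    matches = 0
    for a, b in zip(aligned_a.upper(), aligned_b.upper()):
        if a == "-" or b == "-":
            continue  # skip columns containing a gap (assumed convention)
        compared += 1
        if a == b:
            matches += 1
    if compared == 0:
        return 0.0
    return 100.0 * matches / compared


def meets_identity_threshold(aligned_a: str, aligned_b: str,
                             threshold: float = 50.0) -> bool:
    """True if identity reaches the threshold quoted above for highly accurate modeling."""
    return percent_identity(aligned_a, aligned_b) >= threshold


# Hypothetical aligned fragments: target of unknown structure vs. solved template.
target = "MKTA"
template = "MRSA"
identity = percent_identity(target, template)
```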
Threading bases structural modeling on fold similarities rather than sequence identity.
This method may help identify distantly related proteins and can be used to infer molecular functions.
A number of efforts are ongoing to solve the structure of every protein in a given proteome.
T. maritima was selected as a structural genomics target based on its relatively small genome consisting of 1,877 genes and the hypothesis that the proteins expressed by a thermophilic bacterium would be easier to crystallize.
Lesley et al. used Escherichia coli to express all the open reading frames (ORFs) of T. maritima.
The fully sequenced genome of M. tuberculosis has allowed scientists to clone many of these protein targets into expression vectors for purification and structure determination by X-ray crystallography.