AlphaFold is an artificial intelligence (AI) program developed by DeepMind, a subsidiary of Alphabet, that predicts protein structures.
On 15 July 2021, the AlphaFold 2 paper was published in Nature as an advance access publication, alongside open-source software and a searchable database of species proteomes.
"[16] Hassabis and Jumper had previously won the Breakthrough Prize in Life Sciences and the Albert Lasker Award for Basic Medical Research in 2023 for their leadership of the AlphaFold project.
Protein structures can be determined experimentally through techniques such as X-ray crystallography, cryo-electron microscopy and nuclear magnetic resonance, which are all expensive and time-consuming.
The program uses a form of attention network, a deep learning technique in which the AI identifies the relevant parts of a larger problem, then pieces them together to obtain the overall solution.
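The general idea of attention can be illustrated with a minimal sketch of scaled dot-product attention, the generic building block of attention networks. This is not AlphaFold's own architecture; the function names and the list-of-vectors representation are assumptions made for illustration.

```python
import math

def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def attention(queries, keys, values):
    """Generic scaled dot-product attention (illustrative, not AlphaFold's code)."""
    dim = len(keys[0])
    outputs = []
    for q in queries:
        # Similarity of this query to every key, scaled by sqrt(dim).
        scores = [sum(qi * ki for qi, ki in zip(q, k)) / math.sqrt(dim) for k in keys]
        weights = softmax(scores)
        # Each output is a weighted average of the value vectors.
        out = [sum(w * v[j] for w, v in zip(weights, values))
               for j in range(len(values[0]))]
        outputs.append(out)
    return outputs
```

Each output vector is a weighted average of the values, with the weights expressing how strongly the query "attends" to each key.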
AlphaFold 2 replaced this with a system of interconnected sub-networks, forming a single, differentiable, end-to-end model based on pattern recognition.
[24][25] After the neural network's prediction converges, a final refinement step applies local physical constraints using energy minimization based on the AMBER force field.
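Energy minimization of this kind can be sketched as gradient descent on a simple energy function. The example below is a toy under stated assumptions: it uses only a harmonic bond-length term between consecutive points, whereas a real force field such as AMBER includes many more terms (angles, torsions, electrostatics, van der Waals interactions). The function names, the `ideal` distance, and the step size are hypothetical choices for illustration.

```python
import math

def bond_energy(coords, ideal=3.8, k=1.0):
    """Sum of harmonic penalties k * (d - ideal)^2 over consecutive points."""
    energy = 0.0
    for a, b in zip(coords, coords[1:]):
        energy += k * (math.dist(a, b) - ideal) ** 2
    return energy

def minimize(coords, ideal=3.8, k=1.0, step=0.05, iterations=500):
    """Plain gradient descent on bond_energy; returns the relaxed coordinates."""
    coords = [list(p) for p in coords]
    for _ in range(iterations):
        grads = [[0.0, 0.0, 0.0] for _ in coords]
        for i in range(len(coords) - 1):
            a, b = coords[i], coords[i + 1]
            d = math.dist(a, b)
            if d == 0.0:
                continue  # direction undefined for coincident points; skip
            # dE/da = 2k (d - ideal) (a - b) / d, with the opposite sign for b.
            scale = 2.0 * k * (d - ideal) / d
            for axis in range(3):
                g = scale * (a[axis] - b[axis])
                grads[i][axis] += g
                grads[i + 1][axis] -= g
        for p, g in zip(coords, grads):
            for axis in range(3):
                p[axis] -= step * g[axis]
    return coords
```

Starting from a chain whose spacing is too short in one place and too long in another, `minimize` nudges the points until every consecutive distance approaches `ideal`, lowering the energy without changing the overall arrangement.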
In an example presented by DeepMind, the structure prediction module achieved a correct topology for the target protein on its first iteration, scored as having a GDT_TS of 78, but with a large number (90%) of stereochemical violations – i.e. unphysical bond angles or lengths.
This model begins with a cloud of atoms and iteratively refines their positions, guided by the Pairformer's output, to generate a 3D representation of the molecular structure.
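The shape of such an iterative refinement loop can be sketched as follows. This is a toy illustration only: `predict_clean` is a hypothetical stand-in for a trained, conditioned denoising network, and the blending and noise schedules are arbitrary assumptions rather than those used by AlphaFold.

```python
import random

def refine(cloud, predict_clean, steps=50, noise=0.1, seed=0):
    """Toy denoising loop: move a point cloud toward a predicted structure."""
    rng = random.Random(seed)
    cloud = [list(atom) for atom in cloud]
    for step in range(steps):
        remaining = steps - step
        # Hypothetical denoiser: its current guess of the clean coordinates.
        estimate = predict_clean(cloud)
        # Take a larger fraction of the remaining distance at each step.
        blend = 1.0 / remaining
        # Re-injected noise shrinks to zero by the final step.
        jitter = noise * (remaining - 1) / steps
        cloud = [
            [c + blend * (e - c) + rng.gauss(0.0, jitter) for c, e in zip(atom, est)]
            for atom, est in zip(cloud, estimate)
        ]
    return cloud
```

With a denoiser that always returns the same target coordinates, the cloud ends on that target; in a real model the estimate would change at every step as the network re-evaluates the partially refined structure.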
[33] In December 2018, DeepMind's AlphaFold placed first in the overall rankings of the 13th Critical Assessment of Techniques for Protein Structure Prediction (CASP).
[42][19] but, as stated in the "Read Me" file on that website: "This code can't be used to predict structure of an arbitrary protein sequence.
[6] On the competition's preferred global distance test (GDT) measure of accuracy, the program achieved a median score of 92.4 (out of 100), meaning that at least half of its predictions scored 92.4 or better for having their atoms in approximately the right place,[45][46] a level of accuracy reported to be comparable to experimental techniques like X-ray crystallography.
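A GDT_TS-style score can be sketched as the average, over distance cutoffs of 1, 2, 4 and 8 Å, of the percentage of positions in the model that lie within that cutoff of the corresponding position in the reference structure. The sketch below assumes the two structures are already superimposed and are given as matching lists of coordinates; the full GDT procedure also searches over superpositions to maximise each count, which is omitted here. The function name is an assumption for illustration.

```python
import math

def gdt_ts(model, reference, cutoffs=(1.0, 2.0, 4.0, 8.0)):
    """Simplified GDT_TS for pre-superimposed, equal-length coordinate lists."""
    if len(model) != len(reference) or not model:
        raise ValueError("model and reference must be non-empty and the same length")
    distances = [math.dist(m, r) for m, r in zip(model, reference)]
    # Percentage of positions within each cutoff, then the mean of those percentages.
    percentages = [
        100.0 * sum(d <= cutoff for d in distances) / len(distances)
        for cutoff in cutoffs
    ]
    return sum(percentages) / len(percentages)
```

A score of 100 means every position lies within 1 Å of its reference; a median over many targets, as reported for the competition, can then be taken with `statistics.median`.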
To further validate AlphaFold 2, the conference organizers approached four leading experimental groups working on structures they found particularly challenging and had been unable to determine.
In all four cases the three-dimensional models produced by AlphaFold 2 were sufficiently accurate to determine structures of these proteins by molecular replacement.
The third exists in nature as a multidomain complex consisting of 52 identical copies of the same domain, a situation AlphaFold was not programmed to consider.
For all targets with a single domain, excluding only one very large protein and the two structures determined by NMR, AlphaFold 2 achieved a GDT_TS score of over 80.
[7] Nobel Prize winner and structural biologist Venki Ramakrishnan called the result "a stunning advance on the protein folding problem",[5] adding that "It has occurred decades before many people in the field would have predicted.
[52][53][54][55] A frequent theme was that the ability to predict protein structures accurately from the constituent amino acid sequence is expected to have a wide variety of benefits in the life sciences, including accelerating advanced drug discovery and enabling a better understanding of diseases.
[57] In 2023, Demis Hassabis and John Jumper won the Breakthrough Prize in Life Sciences[18] as well as the Albert Lasker Award for Basic Medical Research for their management of the AlphaFold project.
[58] Hassabis and Jumper went on to share the 2024 Nobel Prize in Chemistry, awarded for their work on “protein structure prediction”, with David Baker of the University of Washington.
[78][7] Results were reviewed by scientists at the Francis Crick Institute in the United Kingdom before being released to the broader research community.
[79] The team acknowledged that although these protein structures might not be the subject of ongoing therapeutic research efforts, they will add to the community's understanding of the SARS-CoV-2 virus.