[4] In general, the introduction of new functional unnatural amino acids into proteins of living cells breaks the universality of the genetic language, which ideally leads to alternative life forms.
These allow a large repertoire of new functions, such as labeling (see figure), fluorescent reporting (e.g. dansylalanine),[11] or the production in E. coli of proteins carrying eukaryotic post-translational modifications (e.g. phosphoserine, phosphothreonine, and phosphotyrosine).
In the second case, a biosynthetic pathway needs to be engineered; one example is an E. coli strain that biosynthesizes a novel amino acid (p-aminophenylalanine) from basic carbon sources and includes it in its genetic code.
The genetic code has a non-random layout that shows tell-tale signs of various phases of primordial evolution; however, it has since frozen into place and is near-universally conserved.
Later, in the Schultz lab, the tRNATyr/tyrosyl-tRNA synthetase (TyrRS) pair from Methanococcus jannaschii, an archaeon,[6] was used to introduce a tyrosine instead of STOP, the default meaning of the amber codon.
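The effect of amber suppression on translation can be illustrated with a minimal Python sketch. It builds the standard genetic code, then translates a short message twice: once with UAG read as STOP, and once with UAG reassigned to tyrosine, as in the experiment described above. The function name `translate`, the `suppressors` parameter, and the example mRNA are illustrative assumptions; the sketch also assumes suppression is complete, whereas in a cell the suppressor tRNA competes with release factors.

```python
from itertools import product

# Standard genetic code, listed in the conventional U, C, A, G order.
BASES = "UCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
STANDARD_CODE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def translate(mrna, suppressors=None):
    """Translate an mRNA string in frame 0 until the first stop codon.

    `suppressors` (hypothetical parameter) maps a codon to the residue
    inserted by an orthogonal tRNA/synthetase pair, overriding the
    standard meaning of that codon.
    """
    code = dict(STANDARD_CODE)
    code.update(suppressors or {})
    protein = []
    for i in range(0, len(mrna) - 2, 3):
        residue = code[mrna[i:i + 3]]
        if residue == "*":  # "*" marks a stop codon
            break
        protein.append(residue)
    return "".join(protein)


# Illustrative message with an in-frame amber codon (UAG) at position 3.
mrna = "AUGGCUUAGAAAUAA"
truncated = translate(mrna)                  # UAG read as STOP
read_through = translate(mrna, {"UAG": "Y"})  # UAG read as tyrosine
print(truncated, read_through)
```

With the default code the product ends at the amber codon; with the reassignment, translation continues to the next stop codon (UAA).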
One approach, pioneered by the group of Prof. George Church at Harvard, was dubbed MAGE in CAGE: it relied on multiplex transformation and subsequent strain recombination to remove all UAG codons. The latter part presented a halting point in a first paper,[27] but was overcome.
[28] This allowed an experiment to be done with this strain to make it "addicted" to the amino acid biphenylalanine by evolving several key enzymes to require it structurally, therefore putting its expanded genetic code under positive selection.
[30] Another candidate is the AUA codon, which is unusual in that its respective tRNA has to differentiate against AUG that codes for methionine (primordially, isoleucine, hence its location).
The reduced fitness is a first step towards pressuring the strain to lose all instances of AUA, allowing it to be used for genetic code expansion.
[33] Recent developments in genetic code engineering also showed that quadruplet codons could be used to encode non-standard amino acids under experimental conditions.
[34][35][36] This allowed the simultaneous usage of two unnatural amino acids, p-azidophenylalanine (pAzF) and N6-[(2-propynyloxy)carbonyl]lysine (CAK), which cross-link with each other by Huisgen cycloaddition.
[38] This problem can be overcome by specifically engineering and evolving tRNA that can decode quadruplet codons in non-recoded strains.
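How a quadruplet codon changes the reading of a message can be sketched in Python. The example below is a toy model, not a description of any published system: the function name `translate_with_quadruplets`, the choice of AGGA as the four-base codon, the placeholder residue "X", and the example mRNA are all assumptions made for illustration. It also assumes the engineered tRNA always wins, whereas in a non-recoded strain the native tRNA competes to read the first three bases as an ordinary triplet.

```python
from itertools import product

# Standard genetic code, listed in the conventional U, C, A, G order.
BASES = "UCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
STANDARD_CODE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def translate_with_quadruplets(mrna, quadruplets=None):
    """Translate from position 0, consuming four bases wherever a
    quadruplet codon from `quadruplets` (hypothetical mapping) matches,
    and three bases otherwise. Stops at the first stop codon."""
    quadruplets = quadruplets or {}
    protein = []
    i = 0
    while i + 3 <= len(mrna):
        four = mrna[i:i + 4]
        if four in quadruplets:
            # Idealized: the quadruplet-decoding tRNA always outcompetes
            # the native tRNA for the overlapping triplet.
            protein.append(quadruplets[four])
            i += 4
            continue
        residue = STANDARD_CODE[mrna[i:i + 3]]
        if residue == "*":
            break
        protein.append(residue)
        i += 3
    return "".join(protein)


# AUG | AGGA | GCU | UAA when the quadruplet is decoded,
# AUG | AGG | AGC | UUA | A when it is not (frame is lost).
mrna = "AUGAGGAGCUUAA"
triplet_only = translate_with_quadruplets(mrna)
with_quadruplet = translate_with_quadruplets(mrna, {"AGGA": "X"})
print(triplet_only, with_quadruplet)
```

The comparison shows why quadruplet decoding must be efficient: reading the same site as a triplet shifts the frame and produces an unrelated downstream sequence.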
Mutations to the plasmid containing the pair can be introduced by error-prone PCR or through degenerate primers for the synthetase's active site.
Selection involves multiple rounds of a two-step process, where the plasmid is transferred into cells expressing chloramphenicol acetyl transferase with a premature amber codon.
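The logic of a two-step selection round can be sketched as two filters over a library of synthetase variants. This is a simplified illustration under stated assumptions: the text above describes only the chloramphenicol step, and the sketch assumes the second step is a negative selection in which a toxic gene carrying a premature amber codon is expressed without the unnatural amino acid. The class `SynthetaseVariant`, its boolean fields, and the variant names are hypothetical; real variants have graded rather than all-or-nothing activities.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class SynthetaseVariant:
    """Hypothetical, all-or-nothing description of a synthetase mutant."""
    name: str
    charges_unnatural: bool  # loads the tRNA with the unnatural amino acid
    charges_canonical: bool  # loads the tRNA with some standard amino acid


def suppresses_amber(variant, unnatural_present):
    """The amber codon is read through if the tRNA gets charged at all."""
    return variant.charges_canonical or (
        variant.charges_unnatural and unnatural_present
    )


def positive_selection(library):
    # Chloramphenicol plus unnatural amino acid: cells survive only if the
    # premature amber codon in the resistance gene is suppressed.
    return [v for v in library if suppresses_amber(v, unnatural_present=True)]


def negative_selection(library):
    # Assumed second step: toxic gene with a premature amber codon, no
    # unnatural amino acid. Suppression now kills the cell.
    return [v for v in library if not suppresses_amber(v, unnatural_present=False)]


def selection_round(library):
    return negative_selection(positive_selection(library))


library = [
    SynthetaseVariant("inactive", False, False),
    SynthetaseVariant("canonical_only", False, True),
    SynthetaseVariant("promiscuous", True, True),
    SynthetaseVariant("specific", True, False),
]
survivors = selection_round(library)
print([v.name for v in survivors])
```

Only the variant that charges the unnatural amino acid and nothing else passes both filters, which is the outcome the repeated rounds are meant to enrich for.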
Orthogonal ribosomes ideally use different mRNA transcripts than their natural counterparts and ultimately should draw on a separate pool of tRNA as well.
This ribosome did not eliminate the problem of lowered cell fitness caused by suppressed stop codons in natural proteins.
[citation needed] In 2014, it was shown that by altering the peptidyl transferase center of the 23S rRNA, ribosomes could be created which draw on orthogonal pools of tRNA.
Thus far, this system has only been shown to work in an in vitro translation setting where the aminoacylation of the orthogonal tRNA was achieved using so-called "flexizymes".
The ability to site-specifically direct lab-synthesized chemical moieties into proteins allows many types of studies that would otherwise be extremely difficult. The expansion of the genetic code is still in its infancy.
In fact, the group of Jason Chin has recently broken the record for a genetically recoded E. coli strain that can simultaneously incorporate up to 4 unnatural amino acids.
[73] Moreover, software has been developed that allows orthogonal ribosomes and unnatural tRNA/RS pairs to be combined in order to improve protein yield and fidelity.
In 2002, they developed an unnatural base pair between 2-amino-8-(2-thienyl)purine (s) and pyridine-2-one (y) that functions in vitro in transcription and translation for the site-specific incorporation of non-standard amino acids into proteins.
[80] In 2012, a group of American scientists led by Floyd Romesberg, a chemical biologist at the Scripps Research Institute in San Diego, California, published a report that they had designed an unnatural base pair (UBP).
More technically, these artificial nucleotides, which bear hydrophobic nucleobases, feature two fused aromatic rings that form a (d5SICS–dNaM) complex or base pair in DNA.
[82][83] In 2014, the same team from the Scripps Research Institute reported that they had synthesized a stretch of circular DNA, known as a plasmid, containing natural T-A and C-G base pairs along with the best-performing UBP Romesberg's laboratory had designed. They inserted it into cells of the common bacterium E. coli, which successfully replicated the unnatural base pairs through multiple generations.
[84] The artificial strings of DNA do not encode anything yet, but scientists speculate they could be designed to manufacture new proteins which could have industrial or pharmaceutical uses.
[98] In this context, the SPI method generates recombinant protein variants or alloproteins directly by substitution of natural amino acids with unnatural counterparts.
[100] This approach avoids the pitfalls of suppression-based methods[101] and is superior to them in terms of efficiency, reproducibility, and simplicity of the experimental setup.
[102] Numerous studies demonstrated how global substitution of canonical amino acids with various isosteric analogs caused minimal structural perturbations but dramatic changes in thermodynamic,[103] folding,[104] aggregation,[105] and spectral properties[106][107] as well as in enzymatic activity.
[citation needed] In November 2017, a team from the Scripps Research Institute reported having constructed a semi-synthetic E. coli genome using six different nucleotides (versus the four found in nature).
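The combinatorial consequence of moving from four to six nucleotides can be checked with a few lines of Python. The letters X and Y below are placeholders for the two unnatural nucleotides, not their actual names, and the count is only an upper bound on possible triplets: it says nothing about which of them a cell can actually decode.

```python
from itertools import product

NATURAL = "ACGT"
UNNATURAL = "XY"  # placeholder symbols for the two unnatural nucleotides


def codons(alphabet):
    """All three-letter words over the given alphabet."""
    return ["".join(c) for c in product(alphabet, repeat=3)]


natural_codons = codons(NATURAL)
expanded_codons = codons(NATURAL + UNNATURAL)

# Triplets that contain at least one unnatural nucleotide.
new_codons = [c for c in expanded_codons if any(b in UNNATURAL for b in c)]

print(len(natural_codons), len(expanded_codons), len(new_codons))
```

A four-letter alphabet gives 4^3 = 64 triplets and a six-letter alphabet gives 6^3 = 216, so up to 152 additional triplets become available in principle.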