Proteogenomics

Proteogenomics is a field of biological research that combines proteomics, genomics, and transcriptomics to aid in the discovery and identification of peptides.

The utilization of both proteomics and genomics data alongside advances in the availability and power of spectrographic and chromatographic technology led to the emergence of proteogenomics as its own field in 2004.

In addition, the emergence of novel protein sequences due to mutations often cannot be accounted for in traditional proteomic databases, but can be predicted and studied using a synthesis of genomic and transcriptomic data.
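The idea of predicting novel protein sequences from genomic data can be illustrated with a minimal Python sketch. It assumes a toy coding sequence and a single-nucleotide substitution, both invented for illustration, and uses hypothetical helper names (`translate`, `tryptic_peptides`, `variant_peptides`). It translates the reference and mutated sequences with the standard genetic code, digests both in silico with a simple trypsin rule (cleave after K or R unless followed by P), and reports peptides that appear only in the mutated protein. Such peptides would be absent from a reference protein database.

```python
from itertools import product

# Standard genetic code, built from the conventional TCAG ordering.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def translate(cds):
    """Translate a coding sequence (A/C/G/T only) up to the first stop codon."""
    protein = []
    for i in range(0, len(cds) - 2, 3):
        aa = CODON_TABLE[cds[i:i + 3]]
        if aa == "*":
            break
        protein.append(aa)
    return "".join(protein)


def tryptic_peptides(protein):
    """Simplified trypsin rule: cleave after K or R unless the next residue is P."""
    peptides, start = [], 0
    for i, aa in enumerate(protein):
        next_aa = protein[i + 1] if i + 1 < len(protein) else ""
        if aa in "KR" and next_aa != "P":
            peptides.append(protein[start:i + 1])
            start = i + 1
    if start < len(protein):
        peptides.append(protein[start:])
    return peptides


def variant_peptides(reference_cds, position, alt_base):
    """Return peptides present only after substituting alt_base at a 0-based position."""
    mutated_cds = reference_cds[:position] + alt_base + reference_cds[position + 1:]
    reference = set(tryptic_peptides(translate(reference_cds)))
    mutated = set(tryptic_peptides(translate(mutated_cds)))
    return mutated - reference


# Toy example (invented sequence): GGT -> GAT changes glycine to aspartate.
REFERENCE_CDS = "ATGGCTAAAGGTGAACTGCGTTCTTGGAAATAA"
print(translate(REFERENCE_CDS))                 # MAKGELRSWK
print(variant_peptides(REFERENCE_CDS, 10, "A"))  # {'DELR'}
```

In practice, variant calls would come from sequencing data rather than a hand-written substitution, and the predicted peptides would be appended to the search database used to interpret mass spectra.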

The resulting research has applications in improving gene annotations, studying mutations, and understanding the effects of genetic manipulation.

More recently, the joint profiling of surface proteins and mRNA transcripts from single cells by methods such as CITE-Seq and ESCAPE[1] has been referred to as single-cell proteogenomics,[2][3][4] although the goals of these studies are not related to peptide identification.[5]

Proteogenomics emerged as an independent field in 2004, based on the integration of technological advancements in next-generation sequencing-based genomics and mass spectrometry-based proteomics.[6]

The term itself came into use that year, with the publication of a paper by George Church’s research group describing a proteogenomic mapping technique that used proteomics data to better annotate the genome of the bacterium Mycoplasma pneumoniae.

The resulting map proved extremely accurate, with over 81% of predicted genomic reading frames being detected in the bacterial cells studied.[7][8]

The field expanded over the next two decades, initially using proteomics data to aid in refining genetic models via protein databases.

For example, various microorganisms have had their genomic annotation studied through the proteogenomic approach, including Escherichia coli, Mycobacterium, and multiple species of Shewanella bacteria.[14]

Besides improving gene annotations, proteogenomic studies can also provide valuable information about the presence of programmed frameshifts, N-terminal methionine excision, signal peptides, proteolysis and other post-translational modifications.
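The use of peptide evidence to support genome annotation can be sketched in Python as a six-frame translation followed by a peptide lookup. This is a simplified illustration under stated assumptions, not the method of any particular study: the genome fragment and peptides are invented, the function names (`six_frame_translations`, `map_peptides`) are hypothetical, and peptides are matched as exact substrings, whereas real workflows identify peptides from tandem mass spectra with a database search engine.

```python
from itertools import product

# Standard genetic code, built from the conventional TCAG ordering.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(dna):
    """Return the reverse complement of a DNA string (A/C/G/T only)."""
    return dna.translate(COMPLEMENT)[::-1]


def translate_frame(dna, offset):
    """Translate every complete codon from the given offset; stops appear as '*'."""
    return "".join(
        CODON_TABLE[dna[i:i + 3]] for i in range(offset, len(dna) - 2, 3)
    )


def six_frame_translations(genome):
    """Translate three forward (+1..+3) and three reverse (-1..-3) reading frames."""
    reverse = reverse_complement(genome)
    frames = {}
    for offset in range(3):
        frames[f"+{offset + 1}"] = translate_frame(genome, offset)
        frames[f"-{offset + 1}"] = translate_frame(reverse, offset)
    return frames


def map_peptides(genome, peptides):
    """For each peptide, list the reading frames whose translation contains it."""
    frames = six_frame_translations(genome)
    return {
        peptide: [name for name, protein in frames.items() if peptide in protein]
        for peptide in peptides
    }


# Toy genome fragment (invented): a short open reading frame in frame +3.
GENOME = "CCATGGCTAAAGGTGAACTGCGTTCTTGGAAATAAG"
print(map_peptides(GENOME, ["GELR", "WWWW"]))
# {'GELR': ['+3'], 'WWWW': []}
```

A peptide that maps to a frame with no annotated gene suggests a missing or incorrectly bounded gene model, while a peptide that maps nowhere, like the decoy `WWWW` here, provides no annotation evidence.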

In addition to direct applications in cancer treatment and diagnosis, a proteogenomic approach can be used to study proteins that result in resistance to chemotherapy.

Proteogenomics uses an integrated approach by combining genomics, proteomics, and transcriptomics.
Image of a eukaryote cell illustrating how proteins are made: DNA in the nucleus is read by RNA polymerase, then ribosomes in the cytoplasm produce an amino acid strand that folds into a functional protein.