Metabolomics is the scientific study of chemical processes involving metabolites, the small-molecule substrates, intermediates, and products of cell metabolism.
[7] The term "metabolic profile" was introduced by Horning et al. in 1971 after they demonstrated that gas chromatography-mass spectrometry (GC-MS) could be used to measure compounds present in human urine and tissue extracts.
[8][9] The Horning group, along with that of Linus Pauling and Arthur B. Robinson, led the development of GC-MS methods to monitor the metabolites present in urine through the 1970s.
As sensitivity has improved with the evolution of higher magnetic field strengths and magic angle spinning, NMR continues to be a leading analytical tool to investigate metabolism.
[13][14] In 1994 and 1996, liquid chromatography-mass spectrometry metabolomics experiments[15][16] were performed by Gary Siuzdak, working with Richard Lerner (then president of the Scripps Research Institute) and Benjamin Cravatt, to analyze the cerebrospinal fluid of sleep-deprived animals.
In 2005, the first metabolomics tandem mass spectrometry database, METLIN,[17][18] for characterizing human metabolites was developed in the Siuzdak laboratory at the Scripps Research Institute.
Bio-specimens used for metabolomics analysis include, but are not limited to, plasma, serum, urine, saliva, feces, muscle, sweat, exhaled breath and gastrointestinal fluid.
[citation needed] Metabonomics is defined as "the quantitative measurement of the dynamic multiparametric metabolic response of living systems to pathophysiological stimuli or genetic modification".
It uses many techniques from other subfields of metabolomics, and has applications in biofuel development, bioprocessing, determining drugs' mechanism of action, and studying intercellular interactions.
Many bioinformatic tools and software are available to identify associations with disease states and outcomes, determine significant correlations, and characterize metabolic signatures with existing biological knowledge.
EI also fragments the analyte, which provides structural information but increases the complexity of the data and may obscure the molecular ion.
In the 2000s, surface-based mass analysis saw a resurgence, with new MS technologies focused on increasing sensitivity, minimizing background, and reducing sample preparation.
Among the technologies being developed to address this challenge is Nanostructure-Initiator MS (NIMS),[56][57] a desorption/ionization approach that does not require the application of matrix and thereby facilitates small-molecule (i.e., metabolite) identification.
The primary advantage of SIMS is its high spatial resolution (as small as 50 nm), a powerful characteristic for tissue imaging with MS.
However, SIMS has yet to be readily applied to the analysis of biofluids and tissues because of its limited sensitivity at >500 Da and analyte fragmentation generated by the high-energy primary ion beam.
Desorption electrospray ionization (DESI) is a matrix-free technique for analyzing biological samples that uses a charged solvent spray to desorb ions from a surface.
Advantages of DESI are that no special surface is required and the analysis is performed at ambient pressure with full access to the sample during acquisition.
[58] Nuclear magnetic resonance (NMR) spectroscopy is the only detection technique which does not rely on separation of the analytes, and the sample can thus be recovered for further analyses.
These include Fourier-transform ion cyclotron resonance,[61] ion-mobility spectrometry,[62] electrochemical detection (coupled to HPLC), Raman spectroscopy and radiolabeling (when combined with thin-layer chromatography).
[63] For mass spectrometry data, software is available that identifies molecules that vary between subject groups on the basis of their mass-to-charge ratio (m/z) and, depending on the experimental design, sometimes their retention time.
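As a rough illustration of how features from two runs can be paired on mass and retention time, the sketch below matches (m/z, retention time) tuples that agree within a parts-per-million mass tolerance and a retention-time window. The function names, the tuple layout, and the default tolerances (10 ppm, 0.2 min) are assumptions made for this example and do not describe the behavior of any particular software package.

```python
def ppm_error(mz_observed, mz_reference):
    """Signed mass error of mz_observed relative to mz_reference, in ppm."""
    return (mz_observed - mz_reference) / mz_reference * 1e6


def match_features(sample_a, sample_b, ppm_tol=10.0, rt_tol=0.2):
    """Pair features from two runs that agree in both m/z and retention time.

    Each feature is a (mz, rt_minutes) tuple. For every feature in sample_a,
    the feature in sample_b with the smallest absolute ppm error is chosen
    among those inside both tolerances. Unmatched features are skipped.
    The tolerances are illustrative defaults, not recommended settings.
    """
    pairs = []
    for mz_a, rt_a in sample_a:
        candidates = [
            (abs(ppm_error(mz_b, mz_a)), (mz_b, rt_b))
            for mz_b, rt_b in sample_b
            if abs(ppm_error(mz_b, mz_a)) <= ppm_tol
            and abs(rt_b - rt_a) <= rt_tol
        ]
        if candidates:
            # min() picks the candidate with the smallest mass error
            pairs.append(((mz_a, rt_a), min(candidates)[1]))
    return pairs
```

A feature with an identical m/z but a retention time outside the window is deliberately left unmatched, which is why retention time helps separate isomers and other compounds that share the same mass.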
The most common of these methods is principal component analysis (PCA), which can efficiently reduce the dimensions of a dataset to the few that explain the greatest variation.
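The idea behind PCA can be sketched in a few lines: center each variable, form the covariance matrix, and find the direction of greatest variance. The example below is a minimal, dependency-free sketch that recovers only the first principal component by power iteration; it assumes a small dense table (rows are samples, columns are metabolite intensities) with nonzero variance, and the function names are chosen for this illustration. Real analyses would normally use an established numerical library and often scale the variables first.

```python
import math
import random


def center_columns(rows):
    """Subtract each column's mean so every variable has mean zero."""
    n = len(rows)
    means = [sum(col) / n for col in zip(*rows)]
    return [[x - m for x, m in zip(row, means)] for row in rows]


def covariance_matrix(centered):
    """Sample covariance matrix (n - 1 denominator) of centered data."""
    n = len(centered)
    p = len(centered[0])
    return [
        [sum(r[i] * r[j] for r in centered) / (n - 1) for j in range(p)]
        for i in range(p)
    ]


def first_principal_component(rows, iterations=500, seed=0):
    """Return (unit loading vector, fraction of total variance) for PC1.

    Uses power iteration on the covariance matrix, which converges to the
    eigenvector with the largest eigenvalue. Assumes the data are not constant.
    """
    cov = covariance_matrix(center_columns(rows))
    p = len(cov)
    rng = random.Random(seed)
    v = [rng.random() + 0.1 for _ in range(p)]
    for _ in range(iterations):
        w = [sum(cov[i][j] * v[j] for j in range(p)) for i in range(p)]
        norm = math.sqrt(sum(x * x for x in w))
        v = [x / norm for x in w]
    # Rayleigh quotient v^T C v gives the eigenvalue for the unit vector v
    eigenvalue = sum(
        v[i] * sum(cov[i][j] * v[j] for j in range(p)) for i in range(p)
    )
    total_variance = sum(cov[i][i] for i in range(p))
    return v, eigenvalue / total_variance


def project(rows, loading):
    """Scores of each sample along the given loading vector."""
    return [
        sum(x * w for x, w in zip(row, loading)) for row in center_columns(rows)
    ]
```

The returned fraction is the share of total variance captured by the first component, which is the quantity usually reported when deciding how many components to keep.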
On the other hand, multivariate statistical methods are widely used for high-dimensional, correlated metabolomics data; the most popular is Projection to Latent Structures (PLS) regression and its classification version, PLS-DA.
[66] In the case of univariate methods, variables are analyzed one by one using classical statistical tools (such as Student's t-test, ANOVA or mixed models), and only those with sufficiently small p-values are considered relevant.
[36] However, correction strategies should be used to reduce false discoveries when multiple comparisons are conducted since there is no standard method for measuring the total amount of metabolites directly in untargeted metabolomics.
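The text does not name a specific correction strategy. One commonly used option is the Benjamini-Hochberg procedure, which controls the false discovery rate across many per-metabolite tests. The sketch below is a minimal implementation under that assumption; the function name and the default threshold of 0.05 are illustrative choices.

```python
def benjamini_hochberg(p_values, alpha=0.05):
    """Benjamini-Hochberg adjustment for a list of p-values.

    Returns (adjusted, reject), both in the original order of p_values.
    adjusted[i] is the BH-adjusted p-value (q-value) and reject[i] is True
    when that value is at or below alpha.
    """
    m = len(p_values)
    # Indices of the p-values sorted from smallest to largest
    order = sorted(range(m), key=lambda i: p_values[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value down, scaling by m / rank and keeping
    # the running minimum so adjusted values never decrease with rank
    for rank in range(m, 0, -1):
        idx = order[rank - 1]
        running_min = min(running_min, p_values[idx] * m / rank)
        adjusted[idx] = running_min
    reject = [q <= alpha for q in adjusted]
    return adjusted, reject
```

In a univariate workflow, each metabolite's p-value (for example from a t-test between two subject groups) would be passed in as one list, and only the metabolites flagged in `reject` would be carried forward.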
These tools allow researchers to apply artificial intelligence to the retention time prediction of small molecules in complex mixtures, such as human plasma, plant extracts, foods, or microbial cultures.
Retention time prediction increases the identification rate in liquid chromatography and can lead to an improved biological interpretation of metabolomics data.
[48] For functional genomics, metabolomics can be an excellent tool for determining the phenotype caused by a genetic manipulation, such as gene deletion or insertion.
Sometimes this can be a sufficient goal in itself—for instance, to detect any phenotypic changes in a genetically modified plant intended for human or animal consumption.
The Cravatt laboratory at the Scripps Research Institute has recently applied this technology to mammalian systems, identifying the N-acyltaurines as previously uncharacterized endogenous substrates for the enzyme fatty acid amide hydrolase (FAAH) and the monoalkylglycerol ethers (MAGEs) as endogenous substrates for the uncharacterized hydrolase KIAA1363.
[71] This bioinformatics-based pairing method enables natural product discovery at a larger scale by refining non-targeted metabolomic analyses to identify small molecules with related biosynthesis and to focus on those whose structures may not be previously known.