Gene expression

The process of gene expression is used by all known life, including eukaryotes (among them multicellular organisms) and prokaryotes (bacteria and archaea), and is utilized by viruses, to generate the macromolecular machinery for life.

Such phenotypes are often displayed by the synthesis of proteins that control the organism's structure and development, or that act as enzymes catalyzing specific metabolic pathways.

All steps in the gene expression process may be modulated (regulated), including the transcription, RNA splicing, translation, and post-translational modification of a protein.

Because these transcripts can potentially be translated into different proteins, splicing extends the complexity of eukaryotic gene expression and the size of a species' proteome.
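As a rough illustration of how splicing multiplies the number of possible transcripts, the sketch below enumerates the mature mRNAs obtainable from a toy pre-mRNA if every internal exon may be either kept or skipped. The exon sequences and the function name `exon_skipping_isoforms` are invented for this example. Real alternative splicing is constrained by splice sites and regulatory factors, so it is not a free combinatorial choice.

```python
from itertools import combinations


def exon_skipping_isoforms(exons):
    """Return every transcript that keeps the first and last exon plus
    any subset of the internal exons, preserving exon order.

    Simplifying assumption: exon skipping is the only splicing mode.
    """
    if len(exons) < 2:
        return ["".join(exons)]
    first, internal, last = exons[0], exons[1:-1], exons[-1]
    isoforms = []
    for k in range(len(internal) + 1):
        # combinations() yields subsets in their original left-to-right order
        for kept in combinations(internal, k):
            isoforms.append(first + "".join(kept) + last)
    return isoforms


# Hypothetical exon sequences, chosen only for illustration
toy_exons = ["AUGGCU", "GAAUUC", "CCGGAU", "UGGUAA"]
isoforms = exon_skipping_isoforms(toy_exons)
print(len(isoforms), "possible transcripts")
```

Under this simplified model, a gene with n internal exons yields up to 2^n distinct transcripts, which is one way to see why the proteome can be larger than the number of genes.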

The pre-rRNA is cleaved and modified (2′-O-methylation and pseudouridine formation) at specific sites by approximately 150 different small nucleolus-restricted RNA species, called snoRNAs.

After being exported, it is processed into mature miRNA in the cytoplasm by the endonuclease Dicer, which also initiates the formation of the RNA-induced silencing complex (RISC), a complex that includes the Argonaute protein.

[29] Each protein exists as an unfolded polypeptide or random coil when translated from a sequence of mRNA into a linear chain of amino acids.
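The mapping from an mRNA sequence to a linear chain of amino acids can be sketched in a few lines. The example below assumes the standard genetic code and a deliberately simplified model in which translation starts at the first AUG and stops at the first in-frame stop codon. The function name `translate` and the input sequence are illustrative only.

```python
BASES = "UCAG"
# Standard genetic code, one-letter amino acids, codons ordered UUU, UUC, UUA, ...
# '*' marks a stop codon.
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"

CODON_TABLE = {
    b1 + b2 + b3: AMINO_ACIDS[16 * i + 4 * j + k]
    for i, b1 in enumerate(BASES)
    for j, b2 in enumerate(BASES)
    for k, b3 in enumerate(BASES)
}


def translate(mrna):
    """Translate from the first AUG to the first in-frame stop codon.

    Simplification: no 5' cap, UTR context, or ribosome scanning rules.
    """
    start = mrna.find("AUG")
    if start == -1:
        return ""  # no start codon, nothing is translated
    peptide = []
    for pos in range(start, len(mrna) - 2, 3):
        residue = CODON_TABLE[mrna[pos:pos + 3]]
        if residue == "*":
            break
        peptide.append(residue)
    return "".join(peptide)


print(translate("GGAUGGCCUGGUAAC"))  # codons read: AUG GCC UGG UAA
```

The result is only the primary sequence. Folding into the native state depends on interactions between the residues and is not captured by this lookup.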

[30] Amino acids interact with each other to produce a well-defined three-dimensional structure, the folded protein (the right-hand side of the figure), known as the native state.

[42][43] Protein degradation is a major regulatory mechanism of gene expression[44][45] and contributes substantially to shaping proteomes, especially those of tissues and cells that do not grow very fast.

More generally, gene regulation gives the cell control over all structure and function, and is the basis for cellular differentiation, morphogenesis and the versatility and adaptability of any organism.

In general, gene expression is regulated through changes[47] in the number and type of interactions between molecules[48] that collectively influence transcription of DNA[49] and translation of RNA.

[56] The activity of transcription factors is further modulated by intracellular signals causing protein post-translational modification including phosphorylation, acetylation, or glycosylation.

[65] In eukaryotes the structure of chromatin, controlled by the histone code, regulates access to DNA with significant impacts on the expression of genes in euchromatin and heterochromatin areas.

[73] Enhancers, when active, are generally transcribed from both strands of DNA with RNA polymerases acting in two different directions, producing two eRNAs as illustrated in the figure.

The pattern of induced and repressed genes within neurons appears to provide a molecular basis for forming the first transient memory of this training event in the hippocampus of the rat brain.

[96] By binding to specific sites within the 3′-UTR, miRNAs can decrease gene expression of various mRNAs by either inhibiting translation or directly causing degradation of the transcript.

[109] Protein translation is a major target for toxins and antibiotics, which can kill a cell by inhibiting translation and thereby overriding its normal gene expression control.

[112] Proteolysis, other than being involved in breaking down proteins, is also important in activating and deactivating them, and in regulating biological processes such as DNA transcription and cell death.

[119] Because the use of radioactive reagents makes the procedure time-consuming and potentially dangerous, alternative labeling and detection methods, such as digoxigenin and biotin chemistries, have been developed.

[120] Perceived disadvantages of Northern blotting are that large quantities of RNA are required and that quantification may not be completely accurate, as it involves measuring band strength in an image of a gel.

The cDNA template is then amplified in the quantitative step, during which the fluorescence emitted by labeled hybridization probes or intercalating dyes changes as the DNA amplification process progresses.
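Why the fluorescence signal carries quantitative information can be seen in an idealized model. The sketch below assumes that the signal is proportional to the amount of product, that the product multiplies by a fixed factor each cycle, and that there is no plateau phase. The name `threshold_cycle`, the threshold value, and the copy numbers are hypothetical.

```python
def threshold_cycle(initial_copies, threshold, efficiency=1.0, max_cycles=40):
    """Return the first cycle at which the modelled product amount reaches
    `threshold`, or None if it never does within `max_cycles`.

    Idealized assumption: the amount grows by a factor of (1 + efficiency)
    per cycle, so efficiency=1.0 means perfect doubling.
    """
    amount = float(initial_copies)
    for cycle in range(1, max_cycles + 1):
        amount *= 1.0 + efficiency
        if amount >= threshold:
            return cycle
    return None


# Two hypothetical samples that differ 8-fold in starting template
low = threshold_cycle(1_000, threshold=1e9)
high = threshold_cycle(8_000, threshold=1e9)
print(low, high, low - high)
```

With perfect doubling, an 8-fold difference in starting template shifts the crossing point by three cycles (2^3 = 8), so earlier crossing indicates more starting cDNA. Real assays deviate from this model because amplification efficiency is below 100% and the reaction eventually saturates.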

[128] Alternatively, "tag based" technologies like Serial analysis of gene expression (SAGE) and RNA-Seq, which can provide a relative measure of the cellular concentration of different mRNAs, can be used.
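The "relative measure" produced by tag-based methods can be illustrated with a simple normalization: dividing each gene's tag count by the total number of tags and scaling to one million. This is one common convention among several. The gene names, counts, and the function name `counts_per_million` below are made up for the example.

```python
def counts_per_million(tag_counts):
    """Scale raw tag counts so that they sum to one million.

    The result is a relative abundance: it says how large a share of all
    counted tags came from each gene, not how many molecules were present.
    """
    total = sum(tag_counts.values())
    if total == 0:
        raise ValueError("no tags were counted")
    return {gene: count * 1_000_000 / total for gene, count in tag_counts.items()}


# Hypothetical raw counts from a single sample
toy_counts = {"geneA": 500, "geneB": 1500, "geneC": 8000}
cpm = counts_per_million(toy_counts)
for gene, value in cpm.items():
    print(gene, value)
```

Because the values are shares of a fixed total, a large increase in one highly expressed gene lowers the normalized values of all others even if their absolute levels are unchanged.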

Although NGS is comparatively time-consuming, expensive, and resource-intensive, it can identify single-nucleotide polymorphisms, splice variants, and novel genes, and can also be used to profile expression in organisms for which little or no sequence information is available.

For genes encoding proteins, the expression level can be directly assessed by a number of methods with some clear analogies to the techniques for mRNA quantification.

The gel-based nature of this assay makes quantification less accurate, but it has the advantage of being able to identify later modifications to the protein, for example proteolysis or ubiquitination, from changes in size.

By replacing the gene with a new version fused to a green fluorescent protein marker or similar, expression may be directly quantified in live cells.

Doxycycline is also used in "Tet-on" and "Tet-off" tetracycline-controlled transcriptional activation to regulate transgene expression in organisms and cell cultures.

In addition to these biological tools, certain naturally observed configurations of DNA (genes, promoters, enhancers, repressors) and the associated machinery itself are referred to as an expression system.

There are several ways to construct gene expression networks, but one common approach is to compute a matrix of all pair-wise correlations of expression across conditions, time points, or individuals and convert the matrix (after thresholding at some cut-off value) into a graphical representation in which nodes represent genes, transcripts, or proteins and edges connecting these nodes represent the strength of association (see GeneNetwork and GeneNetwork 2).
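The correlation-and-threshold approach can be sketched as follows. This is a minimal illustration, not the method of any particular tool: it assumes Pearson correlation as the association measure and an arbitrary cut-off of 0.8. The expression values, gene names, and function names are hypothetical.

```python
from itertools import combinations
from math import sqrt


def pearson(x, y):
    """Pearson correlation of two equal-length profiles.

    Assumes neither profile is constant (which would give zero variance).
    """
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / sqrt(var_x * var_y)


def coexpression_network(expression, cutoff=0.8):
    """Return {(gene1, gene2): r} for every pair with |r| >= cutoff.

    Nodes are the genes; each retained pair is an edge weighted by r.
    """
    edges = {}
    for g1, g2 in combinations(sorted(expression), 2):
        r = pearson(expression[g1], expression[g2])
        if abs(r) >= cutoff:
            edges[(g1, g2)] = r
    return edges


# Hypothetical expression levels across five conditions
toy_expression = {
    "geneA": [1.0, 2.0, 3.0, 4.0, 5.0],
    "geneB": [2.1, 3.9, 6.2, 8.0, 9.8],
    "geneC": [5.0, 1.0, 4.0, 2.0, 3.0],
}
network = coexpression_network(toy_expression)
for (g1, g2), r in network.items():
    print(g1, g2, round(r, 3))
```

In this toy data only geneA and geneB are linked. The choice of cut-off strongly affects the resulting graph, so in practice it is usually justified statistically and not fixed in advance.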

[137] The following experimental techniques are used to measure gene expression and are listed in roughly chronological order, starting with the older, more established technologies.

RNA polymerase moving along a stretch of DNA, leaving behind a newly synthesized strand of RNA.
The process of transcription is carried out by RNA polymerase (RNAP), which uses DNA (black) as a template and produces RNA (blue).
Pre-mRNA is spliced to form mature mRNA.
Illustration of exons and introns in pre-mRNA and the formation of mature mRNA by splicing. The UTRs (in green) are non-coding parts of exons at the ends of the mRNA.
Ribosome translating messenger RNA into a chain of amino acids (protein).
During translation, a tRNA charged with an amino acid enters the ribosome and aligns with the correct mRNA triplet. The ribosome then adds the amino acid to the growing protein chain.
Process of protein folding.
Protein before (left) and after (right) folding
A cat with patches of orange and black fur.
The patchy colours of a tortoiseshell cat are the result of different levels of expression of pigmentation genes in different areas of the skin.
When lactose is present in a prokaryote, it acts as an inducer and inactivates the repressor so that the genes for lactose metabolism can be transcribed.
Ribbon diagram of the lambda repressor dimer bound to DNA.
The lambda repressor transcription factor (green) binds as a dimer to the major groove of its DNA target (red and blue) and disables initiation of transcription. From PDB: 1LMB.
A cartoon representation of the nucleosome structure.
In eukaryotes, DNA is organized in the form of nucleosomes. Note how the DNA (blue and green) is tightly wrapped around the protein core made of a histone octamer (ribbon coils), restricting access to the DNA. From PDB: 1KX5.
Regulation of transcription in mammals. An active enhancer regulatory region is enabled to interact with the promoter region of its target gene by formation of a chromosome loop. This can initiate messenger RNA (mRNA) synthesis by RNA polymerase II (RNAP II) bound to the promoter at the transcription start site of the gene. The loop is stabilized by one architectural protein anchored to the enhancer and one anchored to the promoter, and these proteins are joined to form a dimer (red zigzags). Specific regulatory transcription factors bind to DNA sequence motifs on the enhancer. General transcription factors bind to the promoter. When a transcription factor is activated by a signal (here indicated as phosphorylation, shown by a small red star on a transcription factor on the enhancer), the enhancer is activated and can now activate its target promoter. The active enhancer is transcribed on each strand of DNA in opposite directions by bound RNAP IIs. Mediator proteins (a complex consisting of about 26 proteins in an interacting structure) communicate regulatory signals from the enhancer DNA-bound transcription factors to the promoter.
DNA methylation is the addition of a methyl group to DNA at a cytosine. The image shows a cytosine single-ring base with a methyl group added onto the 5 carbon. In mammals, DNA methylation occurs almost exclusively at a cytosine that is followed by a guanine.
The identified areas of the human brain are involved in memory formation.
The chemical structure of a neomycin molecule.
Neomycin is an example of a small molecule that reduces expression of all protein genes, inevitably leading to cell death; it thus acts as an antibiotic.
Schematic karyogram of a human, showing an overview of the expression of the human genome using G banding, a method that includes Giemsa staining, wherein the lighter-staining regions are generally more transcriptionally active, whereas darker regions are more inactive.
An RNA expression diagram.
The RNA expression profile of the GLUT4 transporter (one of the main glucose transporters found in the human body).
Visualization of hunchback mRNA in a Drosophila embryo.
In situ hybridization of Drosophila embryos at different developmental stages for the mRNA responsible for the expression of hunchback. High intensity of blue color marks places with a high quantity of hunchback mRNA.
A ribbon diagram of green fluorescent protein, resembling a barrel structure.
The three-dimensional structure of green fluorescent protein. The residues in the centre of the "barrel" are responsible for production of green light after exposure to higher-energy blue light. From PDB: 1EMA.
Tet-ON inducible shRNA system