Gene family

The expansion or contraction of gene families along a specific lineage can be due to chance, or can be the result of natural selection.

Recent work uses a combination of statistical models and algorithmic techniques to detect gene families that are under the effect of natural selection.

[3] The HUGO Gene Nomenclature Committee (HGNC) creates nomenclature schemes using a "stem" (or "root") symbol for members of a gene family (by homology or function), with a hierarchical numbering system to distinguish the individual members.
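
As a rough illustration of the stem-plus-numbering scheme, the sketch below groups gene symbols under a set of stem symbols. The function name `group_by_stem` and the stem list are assumptions made for this example; they are not part of any HGNC tool, and real family assignment relies on curated data rather than string prefixes.

```python
from collections import defaultdict


def group_by_stem(symbols, stems):
    """Group gene symbols under the longest matching stem (hypothetical helper).

    Returns a dict mapping each stem to the list of member suffixes.
    Symbols that match no stem are collected under the key None.
    """
    groups = defaultdict(list)
    # Try longer stems first so that "KRTAP" is preferred over "KRT".
    ordered = sorted(stems, key=len, reverse=True)
    for symbol in symbols:
        for stem in ordered:
            if symbol.startswith(stem) and len(symbol) > len(stem):
                groups[stem].append(symbol[len(stem):])
                break
        else:
            # No stem matched this symbol.
            groups[None].append(symbol)
    return dict(groups)


# Illustrative input only; the stems passed here are chosen for the example.
example = group_by_stem(
    ["KRT1", "KRT2", "KRTAP1-1", "CYP2D6", "TP53"],
    {"KRT", "KRTAP", "CYP"},
)
```

Here the suffix `2D6` of `CYP2D6` reflects the hierarchical part of the numbering (family, subfamily, member), while `TP53` falls outside the supplied stems.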

[6][8] Multigene families typically consist of members with similar sequences and functions, though a high degree of divergence (at the sequence and/or functional level) does not lead to the removal of a gene from a gene family.

Due to the similarity of their sequences and their overlapping functions, individual genes in the family often share regulatory control elements.

Such families allow for massive amounts of gene product to be expressed in a short time as needed.

The large number of members allows superfamilies to be widely dispersed with some genes clustered and some spread far apart.

The genes are diverse in sequence and function, displaying various levels of expression and separate regulatory controls.

[6] Duplications can occur within a lineage (e.g., humans might have two copies of a gene that is found only once in chimpanzees) or can be the result of speciation.

For example, a single gene in the ancestor of humans and chimpanzees now occurs in both species and can be thought of as having been 'duplicated' via speciation.

[8] Duplication occurs primarily through unequal crossing over events during meiosis in germ cells.

The protein transposase recognizes the outermost inverted repeats, cutting the DNA segment.

Any genes between the two transposable elements are relocated as the composite transposon jumps to a new area of the genome.

This new DNA copy of the mRNA is integrated into another part of the genome, resulting in gene family members being dispersed.

This protein, a reverse transcriptase, aids in copying the RNA transcripts of LINEs and SINEs back into DNA, and integrates them into different areas of the genome.

Due to the highly repetitive nature of these elements, LINEs and SINEs when close together also trigger unequal crossing over events which result in single-gene duplications and the formation of gene families.

[6][8] Non-synonymous mutations, which result in the substitution of amino acids, increase in duplicate gene copies.
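
The distinction between synonymous and non-synonymous changes can be made concrete with a short sketch. It assumes the standard genetic code and compares two codons of equal length; the function name `classify_substitution` is hypothetical and chosen only for this example.

```python
# Standard genetic code, listed in TCAG order for each codon position.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"

CODON_TABLE = {
    a + b + c: AMINO_ACIDS[16 * i + 4 * j + k]
    for i, a in enumerate(BASES)
    for j, b in enumerate(BASES)
    for k, c in enumerate(BASES)
}


def classify_substitution(codon_before, codon_after):
    """Label a codon change as 'synonymous' or 'non-synonymous'.

    A change is synonymous when both codons encode the same amino acid
    (or both are stop codons, written '*'); otherwise it is non-synonymous.
    """
    before = CODON_TABLE[codon_before.upper()]
    after = CODON_TABLE[codon_after.upper()]
    return "synonymous" if before == after else "non-synonymous"


silent = classify_substitution("GAA", "GAG")    # both codons encode glutamate
altering = classify_substitution("GAG", "GTG")  # glutamate replaced by valine
```

Only the second kind of change alters the encoded protein, which is why non-synonymous substitutions are the ones relevant to functional divergence between duplicate copies.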

[10] Gene families, part of a hierarchy of information storage in a genome, play a large role in the evolution and diversity of multicellular organisms.

Contraction of gene families commonly results from the accumulation of loss-of-function mutations.

Phylogenetic tree of the Mup gene family
Gene phylogeny as lines within grey species phylogeny. Top: An ancestral gene duplication produces two paralogs (histone H1.1 and 1.2). A speciation event produces orthologs in the two daughter species (human and chimpanzee). Bottom: In a separate species (E. coli), a gene has a similar function (histone-like nucleoid-structuring protein) but has a separate evolutionary origin and so is an analog.