Gene nomenclature

[2] Several other genus-specific research communities (e.g., Drosophila fruit flies, Mus mice) have also adopted nomenclature standards and have published them on the relevant model organism websites and in scientific journals, including the Trends in Genetics Genetic Nomenclature Guide.

Regarding the first duality (same symbol and name for gene or protein), the context usually makes the sense clear to scientific readers, and the nomenclatural systems also provide for some specificity by using italic for a symbol when the gene is meant and plain (roman) for when the protein is meant.

Also, because of how scientific knowledge has unfolded, proteins and their corresponding genes often have several synonymous names and symbols.

Some older names and symbols live on simply because they have been widely used in the scientific literature (including before the newer ones were coined) and are well established among users.

For some nonhuman species, model organism databases serve as central repositories of guidelines and help resources, including advice from curators and nomenclature committees.

Standards were proposed in 1966 by Demerec et al.[8] Each bacterial gene is denoted by a mnemonic of three lowercase letters that indicate the pathway or process in which the gene product is involved, followed by a capital letter signifying the actual gene.
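The pattern described above (three lowercase letters plus one capital letter) can be checked mechanically. The following is a minimal sketch in Python, not part of any official tooling; the function name `parse_bacterial_gene` and the example symbol `lacZ` are illustrative choices, and the pattern deliberately ignores superscripts, subscripts, and other modifiers.

```python
import re

# Assumed pattern: a three-letter lowercase mnemonic (pathway or process)
# followed by a single capital letter (the individual gene).
BACTERIAL_GENE = re.compile(r"([a-z]{3})([A-Z])")

def parse_bacterial_gene(symbol):
    """Return (mnemonic, gene_letter), or None if the symbol does not fit."""
    match = BACTERIAL_GENE.fullmatch(symbol)
    if match is None:
        return None
    return match.group(1), match.group(2)

print(parse_bacterial_gene("lacZ"))
print(parse_bacterial_gene("LacZ"))
```

Here `lacZ` splits into the mnemonic `lac` and the gene letter `Z`, while `LacZ` is rejected because the mnemonic must be lowercase.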

[9] Since being designated, some y-genes have been confirmed to have a function[10] and have been assigned a synonym (alternative name) in recognition of this.

[10] Loss of gene activity leads to a nutritional requirement (auxotrophy) not exhibited by the wildtype (prototrophy).

Additional superscripts, subscripts, and other modifiers provide more information about the mutation. When referring to the genotype (the gene), the mnemonic is italicized and not capitalised.

[11] The research communities of vertebrate model organisms have adopted guidelines whereby genes in these species are given, whenever possible, the same names as their human orthologs.

For example, the symbol for the gene v-akt murine thymoma viral oncogene homolog 1, which is AKT1, cannot be said to be an acronym for the name, and neither can any of its various synonyms, which include AKT, PKB, PRKBA, and RAC.
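Because a symbol and its synonyms are arbitrary labels rather than abbreviations, the relationship between them has to be stored as data; it cannot be derived from the gene name. A minimal Python sketch of such a lookup follows, using only the synonyms listed above. The names `ALIASES` and `resolve_symbol` are hypothetical, and a real alias table would come from a nomenclature database rather than a hard-coded dictionary.

```python
# Synonyms given in the text for the gene whose symbol is AKT1.
# This table is illustrative only and covers a single gene.
ALIASES = {
    "AKT": "AKT1",
    "PKB": "AKT1",
    "PRKBA": "AKT1",
    "RAC": "AKT1",
}

def resolve_symbol(symbol, aliases=ALIASES):
    """Map a known synonym to its current symbol; leave other symbols unchanged."""
    return aliases.get(symbol, symbol)

print(resolve_symbol("PKB"))
print(resolve_symbol("AKT1"))
```

A symbol that is not in the table is returned as-is, so the function never invents a mapping it does not know.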

In this sense they are similar to the symbols for units of measurement in the SI system (such as km for the kilometre), in that they can be viewed as true logograms rather than just abbreviations.

All human gene names and symbols can be searched online at the HGNC[13] website, and the guidelines for their formation are available there.

Human gene symbols generally are italicised, with all letters in uppercase (e.g., SHH, for sonic hedgehog).

[20] A nearly universal rule in copyediting of articles for medical journals and other health science publications is that abbreviations and acronyms must be expanded at first use, to provide a glossing type of explanation.

Nevertheless, gene and protein symbols "look just like" abbreviations and acronyms, which presents the problem that "failing" to "expand" them (even though it is not actually a failure and there are no true expansions) creates the appearance of violating the spell-out-all-acronyms rule.

One common way of reconciling these two opposing forces is simply to exempt all gene and protein symbols from the glossing rule.

"[22] Because copyeditors are not expected or allowed to rewrite the gene and protein nomenclature throughout a manuscript (except by rare express instructions on particular assignments), the middle ground in manuscripts using synonyms or older symbols is that the copyeditor will add a mention of the current official symbol at least as a parenthetical gloss at the first mention of the gene or protein, and query for confirmation.

Some basic conventions, such as (1) that animal/human homolog (ortholog) pairs differ in letter case (title case and all caps, respectively) and (2) that the symbol is italicized when referring to the gene but nonitalic when referring to the protein, are often not followed by contributors to medical journals.
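The two conventions just named can be expressed as two small, independent styling steps. The sketch below is a simplified Python illustration under stated assumptions: the function names are hypothetical, italics are represented with `<i>` tags purely for demonstration, and species-specific rules (for example, how a given community cases protein symbols) are not modelled.

```python
def case_gene_symbol(symbol, species):
    """Convention (1): all caps for human, title case for the animal ortholog."""
    if species == "human":
        return symbol.upper()
    if species == "animal":
        return symbol.capitalize()
    raise ValueError(f"unsupported species: {species!r}")

def mark_up_symbol(symbol, entity):
    """Convention (2): italic for the gene, plain (roman) for the protein."""
    if entity == "gene":
        return f"<i>{symbol}</i>"  # <i> is an assumed stand-in for italics
    if entity == "protein":
        return symbol
    raise ValueError(f"unsupported entity: {entity!r}")

print(mark_up_symbol(case_gene_symbol("shh", "human"), "gene"))
print(mark_up_symbol(case_gene_symbol("SHH", "animal"), "gene"))
print(mark_up_symbol("SHH", "protein"))
```

Keeping the casing step separate from the markup step mirrors how a copyeditor might restyle a manuscript: letter case depends on the species, while italics depend on whether the gene or the protein is meant.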

Many journals have the copyeditors restyle the casing and formatting to the extent feasible, although in complex genetics discussions only subject-matter experts (SMEs) can effortlessly parse them all.

This seems confusing on the surface, although it is easier to understand when explained as follows: in this gene's case, as in many others, the alias (description) "happens to use the same letter string" that the symbol uses.

But the end result of all these factors is that the published literature often does not follow the nomenclature guidelines completely.