Initially developed by the International Union of Pure and Applied Chemistry (IUPAC) and National Institute of Standards and Technology (NIST) from 2000 to 2005, the format and algorithms are non-proprietary.
The InChI algorithm converts input structural information into a unique InChI identifier in a three-step process: normalization (to remove redundant information), canonicalization (to generate a unique number label for each atom), and serialization (to give a string of characters).
InChIs differ from the widely used CAS registry numbers in three respects: firstly, they are freely usable and non-proprietary; secondly, they can be computed from structural information and do not have to be assigned by some organization; and thirdly, most of the information in an InChI is human readable (with practice).
The InChIKey specification was released in September 2007 in order to facilitate web searches for chemical compounds, since these were problematic with the full-length InChI.
The continuing development of the standard has been supported since 2010 by the not-for-profit InChI Trust, of which IUPAC is a member.
[10] The current software version is 1.07.1 (August 2024), uses the MIT license, and may be downloaded from the InChI GitHub site.
This may involve changing bond orders, rearranging formal charges and possibly adding and removing protons.
One way this can happen is that all metal atoms are disconnected during normalization; so, for example, the InChI for tetraethyllead will have five components, one for lead and four for the ethyl groups.
Finally, a nonstandard reconnected /r layer can be added, which effectively gives a new InChI generated without breaking bonds to metal atoms.
The condensed, 27 character InChIKey is a hashed version of the full InChI (using the SHA-256 algorithm), designed to allow for easy web searches of chemical compounds.
[12] The InChIKey currently consists of three parts separated by hyphens, of 14, 10 and one character(s), respectively, like XXXXXXXXXXXXXX-YYYYYYYYFV-P.
Current extensions are being defined to handle polymers and mixtures, Markush structures, reactions[16] and organometallics, and once accepted by the Division VIII Subcommittee will be added to the algorithm.