HH-suite

HHsearch and HHblits are the two main programs in the package and the entry points to its search function, the latter being a faster variant.

The programs are among the most popular methods for protein sequence matching and have been cited more than 5000 times in total according to Google Scholar.

Many proteins have been investigated in model organisms such as bacteria, baker's yeast, fruit flies, zebrafish or mice, in which experiments can often be done more easily than in human cells.

The profiles are derived from multiple sequence alignments (MSAs), in which related proteins are written together (aligned) so that the frequencies of amino acids at each position can be interpreted as probabilities for amino acids in new related proteins and used to derive the "similarity scores".
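The idea of reading column frequencies as probabilities can be illustrated with a minimal Python sketch. It is not the HH-suite implementation, which builds profile HMMs and applies sequence weighting and more careful pseudocounts. The toy alignment, the function names, the uniform background of 1/20 and the pseudocount of 0.01 are all assumptions made for illustration.

```python
from collections import Counter
import math

# Toy, gap-free alignment (hypothetical sequences): one aligned row per protein.
msa = [
    "MKVLA",
    "MKILA",
    "MRVLG",
    "MKVLA",
]

def column_frequencies(rows):
    """Return one dict per alignment column, mapping residue -> relative frequency."""
    n_rows = len(rows)
    profile = []
    for column in zip(*rows):          # zip(*rows) walks the alignment column by column
        counts = Counter(column)
        profile.append({aa: c / n_rows for aa, c in counts.items()})
    return profile

def log_odds_score(profile, sequence, background=1 / 20, pseudo=0.01):
    """Sum per-column log-odds of `sequence` under the profile versus a uniform background.

    The pseudocount is a crude way to avoid log(0) for residues never seen in a column.
    """
    score = 0.0
    for column, aa in zip(profile, sequence):
        p = column.get(aa, 0.0) + pseudo
        score += math.log2(p / background)
    return score

profile = column_frequencies(msa)
related = log_odds_score(profile, "MKVLA")    # matches the alignment in every column
unrelated = log_odds_score(profile, "WWWWW")  # shares no residue with any column
```

A sequence that agrees with the frequent residues in each column receives a higher score than one that does not, which is the sense in which the frequencies yield a similarity score.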

The output of HHpred and HHsearch is a ranked list of database matches (including E-values and probabilities for a true relationship) and the pairwise query-database sequence alignments.

Like PSI-BLAST, HHblits works iteratively, building a new query profile in each round from the matches found in the previous round.
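The iterative scheme can be sketched in a few lines of Python. This is a toy under stated assumptions, not the HHblits algorithm: sequences are ungapped and of equal length, the score is a simple profile log-odds sum rather than an HMM-HMM alignment, and the fixed score threshold stands in for the E-value cutoff a real search would use. All names and parameter values are hypothetical.

```python
from collections import Counter
import math

def build_profile(seqs):
    """Per-column residue frequencies of equal-length, ungapped sequences."""
    n = len(seqs)
    return [{aa: c / n for aa, c in Counter(col).items()} for col in zip(*seqs)]

def score(profile, seq, background=0.05, pseudo=0.01):
    """Log-odds of `seq` under the profile versus a uniform background."""
    return sum(math.log2((col.get(aa, 0.0) + pseudo) / background)
               for col, aa in zip(profile, seq))

def iterative_search(query, database, rounds=3, threshold=10.0):
    """Repeatedly rebuild the query profile from the hits of the previous round."""
    members = [query]
    for _ in range(rounds):
        profile = build_profile(members)
        hits = [s for s in database if score(profile, s) >= threshold]
        new_members = [query] + [h for h in hits if h != query]
        if set(new_members) == set(members):
            break                      # converged: no new sequences were added
        members = new_members
    return members

database = ["MKILA", "MKIMA", "WWWWW"]
one_round = iterative_search("MKVLA", database, rounds=1)
three_rounds = iterative_search("MKVLA", database, rounds=3)
```

In this toy, `MKIMA` is too dissimilar to the query to be found in the first round, but it is picked up in the second round once `MKILA` has been added to the profile, which is the benefit of iterating.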

Its prefiltering reduces the tens of millions of HMMs to match against to a few thousand, thus speeding up the otherwise slow HMM-HMM comparison process.
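The general pattern of a cheap filter followed by an expensive comparison can be sketched as follows. The filter shown here is shared k-mer counting, chosen only because it is simple; it is a stand-in and not the prefilter HHblits actually uses. The identity count likewise stands in for the slow HMM-HMM comparison, and all names and data are hypothetical.

```python
def kmers(seq, k=3):
    """Set of all overlapping substrings of length k."""
    return {seq[i:i + k] for i in range(len(seq) - k + 1)}

def prefilter(query, database, k=3, min_shared=1):
    """Cheap first pass: keep only entries sharing at least `min_shared` k-mers."""
    query_kmers = kmers(query, k)
    return [s for s in database if len(query_kmers & kmers(s, k)) >= min_shared]

def expensive_score(a, b):
    """Stand-in for the slow comparison step: number of identical positions."""
    return sum(x == y for x, y in zip(a, b))

def search(query, database, k=3):
    """Run the slow scoring only on entries that survive the prefilter."""
    candidates = prefilter(query, database, k)
    ranked = sorted(candidates, key=lambda s: expensive_score(query, s), reverse=True)
    return candidates, ranked

database = ["MKVLSGT", "WWWWWWW", "AKVLAGS", "QQQQGTQ"]
candidates, ranked = search("MKVLAGT", database)
```

Only two of the four database entries reach the scoring step here; at the scale described above, the same principle means the expensive comparison runs on thousands rather than millions of entries.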

[12] In CASP8, HHpred was ranked 7th on all targets and 2nd on the subset of single-domain proteins, while still being more than 50 times faster than the top-ranked servers.

[4] In addition to HHsearch and HHblits, the HH-suite contains programs and Perl scripts for format conversion, filtering of MSAs, generation of profile HMMs, the addition of secondary structure predictions to MSAs, the extraction of alignments from program output, and the generation of customized databases.

The HMM-HMM alignment algorithm of HHblits and HHsearch was significantly accelerated using vector instructions in version 3 of the HH-suite.

Iterative sequence search scheme of HHblits