Clustal

Clustal has been an important piece of bioinformatics software: according to Nature in 2014, two of its academic publications are among the 100 most-cited papers of all time.

This program accepts a wide range of input formats, including NBRF/PIR, FASTA, EMBL/Swiss-Prot, Clustal, GCG/MSF, GCG9 RSF, and GDE.

ClustalV was released four years later and greatly improved upon the original software, adding and altering a few key features.

Both versions use the same fast approximate algorithm to calculate the similarity scores between sequences, which in turn produces the pairwise alignments.

The algorithm works by calculating the similarity scores as the number of k-tuple matches between two sequences, accounting for a set penalty for gaps.
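The idea of scoring by k-tuple matches with a fixed gap penalty can be illustrated with a toy sketch. It is not Clustal's actual implementation, which is more involved. The function name `ktuple_score`, the defaults `k=2` and `gap_penalty=1`, and the greedy handling of diagonals are all assumptions made for illustration. Matches are grouped by diagonal (the offset between positions in the two sequences), and each extra diagonal used is treated as one gap.

```python
from collections import Counter, defaultdict


def ktuple_score(seq_a, seq_b, k=2, gap_penalty=1):
    """Toy k-tuple similarity: matches on the best diagonal, plus matches on
    further diagonals minus a fixed penalty per extra diagonal (one 'gap').

    Illustrative only; the names and the greedy rule are assumptions.
    """
    # Index every k-tuple of seq_a by its start positions.
    positions = defaultdict(list)
    for i in range(len(seq_a) - k + 1):
        positions[seq_a[i:i + k]].append(i)

    # Count k-tuple matches per diagonal (offset i - j).
    diagonals = Counter()
    for j in range(len(seq_b) - k + 1):
        for i in positions.get(seq_b[j:j + k], ()):
            diagonals[i - j] += 1

    if not diagonals:
        return 0

    counts = sorted(diagonals.values(), reverse=True)
    score = counts[0]
    for matches in counts[1:]:
        # Only switch diagonal (open a gap) if it still improves the score.
        if matches <= gap_penalty:
            break
        score += matches - gap_penalty
    return score


identical = ktuple_score("ACGTTGCA", "ACGTTGCA")
with_insertion = ktuple_score("ACGTTGCA", "ACGTAATGCA")
unrelated = ktuple_score("AAAA", "CCCC")
```

Because only exact k-tuple matches are counted, the score is cheap to compute. In this toy version repeated k-tuples can inflate the score, which a real implementation would have to guard against.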

[16] Some of the most notable additions in ClustalV are profile alignments and full command-line interface options.

[15] The option to run from the command line expedites the multiple sequence alignment process.

When the program finishes, the multiple sequence alignment and the dendrogram are written to files with .aln and .dnd extensions, respectively.
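As a minimal sketch of how the .aln output might be consumed, the following reads a Clustal-format alignment into a dictionary. It assumes the common layout: a header line starting with "CLUSTAL", then blocks of "name sequence" lines, each followed by a conservation line that begins with whitespace. The function name `parse_clustal` and the sample alignment are made up for illustration.

```python
def parse_clustal(text):
    """Parse Clustal-format alignment text into {name: aligned_sequence}.

    Assumes the usual layout described above; not a full validator.
    """
    lines = text.splitlines()
    if not lines or not lines[0].startswith("CLUSTAL"):
        raise ValueError("not a Clustal alignment: missing CLUSTAL header")

    alignment = {}
    for line in lines[1:]:
        # Skip blank lines and conservation lines (they start with whitespace).
        if not line.strip() or line[0].isspace():
            continue
        fields = line.split()
        if len(fields) < 2:
            continue
        name, chunk = fields[0], fields[1]
        # Sequences are split across blocks, so append each chunk in order.
        alignment[name] = alignment.get(name, "") + chunk
    return alignment


# Hypothetical sample, for illustration only.
SAMPLE_ALN = """CLUSTAL W (1.83) multiple sequence alignment

seq1    ACGT-A
seq2    ACGTTA
        **** *

seq1    GG
seq2    GG
        **
"""

parsed = parse_clustal(SAMPLE_ALN)
```

Any trailing residue count on a sequence line is ignored, since only the first two whitespace-separated fields are used.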

[16] ClustalW, like other Clustal versions, is used for aligning multiple nucleotide or protein sequences efficiently.

This program requires three or more sequences in order to calculate a global alignment.

When multiple sequence alignment algorithms were compared in 2014, ClustalW was among the fastest that produced results at the desired level of accuracy.

Clustal Omega uses seeded guide trees and a new HMM engine that focuses on two profiles to generate these alignments.

Clustal Omega is one of the fastest online implementations of all multiple sequence alignment tools and still ranks high in accuracy among both consistency-based and matrix-based algorithms.

Clustal Omega has five main steps in order to generate the multiple sequence alignment.

The speed and accuracy of the guide trees in Clustal Omega are attributed to the implementation of a modified mBed algorithm.
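The general idea behind mBed-style embedding can be sketched as follows. Instead of computing all pairwise distances, each sequence is represented by its distances to a small set of seed sequences, and the resulting vectors can then be clustered to build a guide tree. This is a simplified illustration, not Clustal Omega's modified implementation. The k-mer distance, the seed selection rule (evenly spaced picks from a length-sorted list), and all function names are assumptions.

```python
def kmer_set(seq, k=2):
    """Return the set of overlapping k-mers in seq."""
    return {seq[i:i + k] for i in range(len(seq) - k + 1)}


def kmer_distance(seq_a, seq_b, k=2):
    """Crude distance in [0, 1]: one minus the fraction of shared k-mers.

    An assumed stand-in for whatever distance a real tool would use.
    """
    kmers_a, kmers_b = kmer_set(seq_a, k), kmer_set(seq_b, k)
    if not kmers_a or not kmers_b:
        return 1.0
    shared = len(kmers_a & kmers_b)
    return 1.0 - shared / min(len(kmers_a), len(kmers_b))


def embed(sequences, n_seeds):
    """Map each sequence to a vector of distances to n_seeds seed sequences."""
    # Assumed seed choice: evenly spaced picks from the length-sorted list.
    ordered = sorted(sequences, key=len)
    step = max(1, len(ordered) // n_seeds)
    seeds = ordered[::step][:n_seeds]
    # N * n_seeds distance computations instead of N * N.
    return [[kmer_distance(seq, seed) for seed in seeds] for seq in sequences]


sequences = ["ACGTAC", "ACGTAC", "TTTTTT", "ACGTAA"]
vectors = embed(sequences, n_seeds=2)
```

With N sequences and a number of seeds that grows much more slowly than N, the embedding step avoids the quadratic cost of a full distance matrix, which is what makes guide-tree construction feasible for very large inputs.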

On extremely large datasets with hundreds of thousands of input sequences, Clustal Omega outperforms all other algorithms in time, memory, and accuracy of results.

Clustal Omega uses the HHAlign package of the HH-Suite, which aligns two profile hidden Markov models instead of performing a profile-profile comparison.

On data sets with non-conserved terminal bases, Clustal Omega can be more accurate than ProbCons or T-Coffee, even though both are consistency-based algorithms.

On an efficiency test with programs that produce high accuracy scores, MAFFT was the fastest, closely followed by Clustal Omega.

Both downloads come pre-compiled for many operating systems like Linux, Mac OS X and Windows (both XP and Vista).

This release was designed to make the website more organized and user-friendly, and to update the source code to its most recent versions.

Multiple sequence alignment of CDK4 protein generated with ClustalW. Arrows indicate point mutations.
Steps the ClustalW algorithm uses for global alignments.
Diagram showing the neighbor-joining method in sequence alignment for bioinformatics.
Flowchart depicting the step-by-step algorithm used in Clustal Omega.
The structure of a profile HMM used in the implementation of Clustal Omega is shown here.