CATH database

CATH shares many broad features with the SCOP resource; however, there are also many areas in which the detailed classification differs greatly.

Protein domains are identified within protein chains using a mixture of automatic methods and manual curation.[7]

The domains are then classified within the CATH structural hierarchy. At the Class (C) level, domains are assigned according to their secondary structure content: all alpha, all beta, a mixture of alpha and beta, or little secondary structure. At the Architecture (A) level, assignment is based on the arrangement of the secondary structures in three-dimensional space. At the Topology/fold (T) level, information on how the secondary structure elements are connected and arranged is used. Domains are assigned to the same Homologous superfamily (H) if there is good evidence that they are related by evolution, i.e. that they are homologous.[2]
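The four levels are commonly written as a dotted numeric identifier, one number per level, for example 3.40.50.300. The following is a minimal Python sketch of how such an identifier might be split into its C, A, T and H components. The names CathId, parse_cath_id and CLASS_NAMES are hypothetical and chosen for illustration only; they are not part of any CATH tool. The mapping of class numbers 1 to 4 onto the four categories described above is stated here as an assumption.

```python
from dataclasses import dataclass
from typing import Optional

# Hierarchy levels in order, from broadest to most specific.
LEVELS = ("class", "architecture", "topology", "homologous_superfamily")

# Assumed mapping of class numbers to the four categories in the text.
CLASS_NAMES = {
    1: "mainly alpha",
    2: "mainly beta",
    3: "alpha beta",
    4: "few secondary structures",
}


@dataclass(frozen=True)
class CathId:
    """A (possibly partial) position in the C.A.T.H hierarchy."""
    c: int
    a: Optional[int] = None
    t: Optional[int] = None
    h: Optional[int] = None

    @property
    def depth(self) -> str:
        """Name of the deepest level this identifier specifies."""
        given = [p for p in (self.c, self.a, self.t, self.h) if p is not None]
        return LEVELS[len(given) - 1]

    @property
    def class_name(self) -> str:
        """Human-readable class label, or 'unknown' if not in the mapping."""
        return CLASS_NAMES.get(self.c, "unknown")


def parse_cath_id(text: str) -> CathId:
    """Parse an identifier such as '3.40.50.300' or a prefix such as '3.40'."""
    fields = text.strip().split(".")
    if not 1 <= len(fields) <= 4:
        raise ValueError(f"expected 1 to 4 dot-separated numbers, got {text!r}")
    numbers = [int(f) for f in fields]  # int() raises ValueError on bad input
    if any(n <= 0 for n in numbers):
        raise ValueError(f"level numbers must be positive, got {text!r}")
    return CathId(*numbers)


if __name__ == "__main__":
    node = parse_cath_id("3.40.50.300")
    print(node.depth, "|", node.class_name)
```

A prefix such as 3.40 parses to a node at the Architecture level, which mirrors the way each level of the hierarchy refines the one above it.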

Additional sequence data for domains with no experimentally determined structures are provided by CATH's sister resource, Gene3D, and are used to populate the homologous superfamilies.

The latest release of CATH-Gene3D (v4.3) dates from December 2020.[8]

CATH is an open-source software project whose developers maintain a number of open-source tools,[9] which are publicly available on GitHub.