In topological data analysis, a persistence barcode, sometimes shortened to barcode, is an algebraic invariant associated with a filtered chain complex or a persistence module that characterizes the stability of topological features throughout a growing family of spaces.
[1] Formally, a persistence barcode consists of a multiset of intervals in the extended real line, where the length of each interval corresponds to the lifetime of a topological feature in a filtration, usually built on a point cloud, a graph, a function, or, more generally, a simplicial complex or a chain complex.
Generally, longer intervals in a barcode correspond to more robust features, whereas shorter intervals are more likely to be noise in the data.
A persistence barcode is a complete invariant that captures all the topological information in a filtration.
[2] In algebraic topology, the persistence barcodes were first introduced by Sergey Barannikov in 1994 as the "canonical forms" invariants[2] consisting of a multiset of line segments with ends on two parallel lines, and later, in geometry processing, by Gunnar Carlsson et al. in 2004.
be a fixed field.
Consider a real-valued function on a chain complex
compatible with the differential, so that
the sublevel set
is a subcomplex of K, and the values of
define a filtration (which is in practice always finite): Then, the filtered complexes classification theorem states that for any filtered chain complex over
, there exists a linear transformation that preserves the filtration and brings the filtered complex into so called canonical form, a canonically defined direct sum of filtered complexes of two types: two-dimensional complexes with trivial homology
and one-dimensional complexes with trivial differential
describing the canonical form, is called the barcode, and it is the complete invariant of the filtered chain complex.
The concept of a persistence module is intimately linked to the notion of a filtered chain complex.
A persistence module
consists of a family of
[4] This construction is not specific to
; indeed, it works identically with any totally-ordered set.
A persistence module
is said to be of finite type if it contains a finite number of unique finite-dimensional vector spaces.
The latter condition is sometimes referred to as pointwise finite-dimensional.
Define a persistence module
, where the linear maps are the identity map inside the interval.
is sometimes referred to as an interval module.
-indexed persistence module
of finite type, there exists a multiset
, where the direct sum of persistence modules is carried out index-wise.
, and it is unique up to a reordering of the intervals.
[3] This result was extended to the case of pointwise finite-dimensional persistence modules indexed over an arbitrary totally-ordered set by William Crawley-Boevey and Magnus Botnan in 2020,[7] building upon known results from the structure theorem for finitely generated modules over a PID, as well as the work of Cary Webb for the case of the integers.