Topological deep learning

Traditional deep learning models, such as convolutional neural networks (CNNs) and recurrent neural networks (RNNs), excel in processing data on regular grids and sequences.

However, scientific and real-world data often exhibit more intricate domains, including point clouds, meshes, time series, scalar fields, graphs, and general topological spaces such as simplicial complexes and CW complexes.[7]

TDL addresses this by incorporating topological concepts to process data with higher-order relationships, such as interactions among multiple entities and complex hierarchies.

This approach leverages structures like simplicial complexes and hypergraphs to capture global dependencies and qualitative spatial properties, offering a more nuanced representation of data.

Beyond these structures, TDL also generalizes to data on differentiable manifolds, knots, links, tangles, and curves.

Traditional techniques from deep learning often operate under the assumption that a dataset resides in a highly structured space (like images, where convolutional neural networks exhibit outstanding performance over alternative methods) or in a Euclidean space.

The prevalence of new types of data, in particular graphs, meshes, and molecules, resulted in the development of new techniques, culminating in the field of geometric deep learning, which originally proposed a signal-processing perspective for treating such data types.[15]

While originally confined to graphs, where connectivity is defined based on nodes and edges, follow-up work extended these concepts to a larger variety of data types, including simplicial complexes[16][3] and CW complexes,[8][17] with recent work proposing a unified perspective of message passing on general combinatorial complexes.[18]

While at first restricted to smaller datasets, subsequent work developed new descriptors that efficiently summarize the topological information of datasets and make it available to traditional machine-learning techniques, such as support vector machines or random forests.[22][23][24][25]

Contemporary research in this field is largely concerned with either integrating information about the underlying data topology into existing deep-learning models or obtaining novel ways of training on topological domains.
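As a concrete illustration of such a descriptor, the sketch below computes the 0-dimensional persistence of a point cloud (the scales at which connected components merge as a distance threshold grows) and condenses it into a fixed-length feature vector that a support vector machine or random forest could consume. The point cloud and the summary statistics are hypothetical choices for illustration, not taken from the cited works.

```python
import math

# Toy 2-D point cloud with two visible clusters (hypothetical data).
points = [(0.0, 0.0), (0.1, 0.0), (0.0, 0.1), (2.0, 2.0), (2.1, 2.0)]

def h0_persistence(pts):
    """0-dimensional persistence: sort pairwise distances and merge
    components with union-find; each merge kills one component, whose
    'death' scale is the merging distance (all components are born at 0)."""
    parent = list(range(len(pts)))

    def find(i):
        while parent[i] != i:
            parent[i] = parent[parent[i]]  # path halving
            i = parent[i]
        return i

    edges = sorted(
        (math.dist(pts[i], pts[j]), i, j)
        for i in range(len(pts)) for j in range(i + 1, len(pts))
    )
    deaths = []
    for d, i, j in edges:
        ri, rj = find(i), find(j)
        if ri != rj:
            parent[ri] = rj
            deaths.append(d)  # one connected component dies at scale d
    return deaths  # len(pts) - 1 finite deaths; one component lives forever

deaths = h0_persistence(points)
# A fixed-length descriptor usable by traditional classifiers:
descriptor = [max(deaths), sum(deaths) / len(deaths)]
```

The large maximum death value reflects the scale separating the two clusters, which is exactly the kind of qualitative information these descriptors are meant to expose.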

One of the core concepts in topological deep learning is the domain on which data is defined and supported.

Next, we introduce the most common topological domains that are encountered in a deep learning setting.

Edges provide one way of defining relations among the entities of a set S. More specifically, edges in a graph allow one to define a notion of neighborhood, for instance via the one-hop neighborhood.

The idea of using relations that involve more than two entities is central to topological domains.

Such higher-order relations allow for a broader range of neighborhood functions to be defined on S to capture multi-way interactions among its entities. Next, we review the main properties, advantages, and disadvantages of some commonly studied topological domains in the context of deep learning, including (abstract) simplicial complexes, regular cell complexes, hypergraphs, and combinatorial complexes.
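To make the contrast between pairwise and higher-order neighborhoods concrete, here is a minimal sketch on a toy simplicial complex; the complex, function names, and neighborhood choices are illustrative assumptions:

```python
from itertools import combinations

# A toy abstract simplicial complex over the entity set S = {0, 1, 2, 3}:
# two triangles sharing an edge, stored via its maximal simplices.
maximal = [frozenset({0, 1, 2}), frozenset({1, 2, 3})]

def simplices(maximal_faces):
    """All faces of every maximal simplex (closure under taking subsets)."""
    faces = set()
    for m in maximal_faces:
        for k in range(1, len(m) + 1):
            faces.update(frozenset(c) for c in combinations(m, k))
    return faces

complex_faces = simplices(maximal)

def one_hop(v):
    """Graph-style neighborhood: vertices sharing an edge with v."""
    return {u for f in complex_faces if len(f) == 2 and v in f for u in f} - {v}

def upper_adjacent(edge):
    """Higher-order neighborhood: edges that lie in a common triangle."""
    triangles = [f for f in complex_faces if len(f) == 3 and edge <= f]
    nbrs = set()
    for t in triangles:
        nbrs.update(frozenset(e) for e in combinations(sorted(t), 2))
    return nbrs - {edge}

print(sorted(one_hop(1)))                 # vertices adjacent to vertex 1
print(upper_adjacent(frozenset({1, 2})))  # edges sharing a triangle with {1, 2}
```

The one-hop neighborhood only sees pairwise relations, whereas upper adjacency is defined through the triangles, i.e., through a relation involving three entities at once.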

Cell and simplicial complexes are common examples of higher-order domains equipped with rank functions and therefore with hierarchies of relations.

Hypergraphs constitute examples of higher-order domains equipped with set-type relations.[1]

The learning tasks in TDL can be broadly classified into three categories: cell classification, cell prediction, and complex classification.[1] In practice, to perform these tasks, deep learning models designed for specific topological spaces must be constructed and implemented.

These models, known as topological neural networks (TNNs), are tailored to operate effectively within these spaces.[2][1]

Unlike traditional neural networks tailored for grid-like structures, TNNs are adept at handling more intricate data representations, such as graphs, simplicial complexes, and cell complexes.

By harnessing the inherent topology of the data, TNNs can capture both local and global relationships, enabling nuanced analysis and interpretation.

This allows for a richer representation of spatial relationships compared to traditional graph-based message passing frameworks.

Third, Equation 3 aggregates these messages, allowing information to be exchanged effectively between adjacent cells within the same neighborhood.

Fourth, Equation 4 specifies how the aggregated messages influence the state of a cell in the next layer.
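The equations themselves are not reproduced in this copy; as a hedged sketch, higher-order message passing on a topological domain is typically written as four update rules of the following form, where the symbols ($\mathcal{N}_k$, $\alpha$, $\phi$, and the aggregation operators) are notational assumptions rather than the section's original notation:

```latex
\begin{align*}
m_{x \to y} &= \alpha\big(h_x^{(\ell)}, h_y^{(\ell)}\big),
  \quad x \in \mathcal{N}_k(y)
  && \text{message from cell } x \text{ to cell } y \\
m_y^{k} &= \bigoplus_{x \in \mathcal{N}_k(y)} m_{x \to y}
  && \text{aggregation within neighborhood } \mathcal{N}_k \\
m_y &= \bigotimes_{k=1}^{n} m_y^{k}
  && \text{aggregation across neighborhoods} \\
h_y^{(\ell+1)} &= \phi\big(h_y^{(\ell)}, m_y\big)
  && \text{update of the state of cell } y
\end{align*}
```

Here $h_y^{(\ell)}$ denotes the feature vector of cell $y$ at layer $\ell$, and $\mathcal{N}_1, \dots, \mathcal{N}_n$ are the neighborhood functions (e.g., adjacencies and incidences) chosen for the domain.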

For instance, Maggs et al.[26] leverage geometric information from embedded simplicial complexes, i.e., simplicial complexes with high-dimensional features attached to their vertices. This offers interpretability and geometric consistency without relying on message passing.

Motivated by the modular nature of deep neural networks, initial work in TDL drew inspiration from topological data analysis, and aimed to make the resulting descriptors amenable to integration into deep-learning models.

This led to work defining new layers for deep neural networks.

This was achieved by means of end-to-end-trainable projection functions, permitting topological features to be used to solve shape classification tasks, for instance.

Follow-up work expanded on the theoretical properties of such descriptors and integrated them into the field of representation learning.[33][34][35]

TDL is rapidly finding new applications across different domains, including data compression,[36] enhancing the expressivity and predictive performance of graph neural networks,[16][17][33] action recognition,[37] and trajectory prediction.

(a): A set S is made up of basic entities (vertices) without any connections. (b): A graph represents pairwise connections between the vertices of S. (c): A simplicial complex encodes how entities are related, but with strict rules about which relations must be included. (d): Like a simplicial complex, a cell complex encodes how entities are related, but it is more flexible in the shape its relations ('cells') may take. (f): A hypergraph encodes arbitrary set-type connections between parts of S, without any hierarchical order among them. (e): A CC mixes elements of cell complexes (hierarchically ordered relations) and hypergraphs (arbitrary set-type relations), covering both kinds of setups.[1]
Higher-order message passing is a deep learning framework defined on a topological domain; it relies on passing messages among entities of the underlying domain in order to perform a learning task.[1]
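A minimal sketch of one round of higher-order message passing, here on the edges of a toy simplicial complex where messages flow between edges sharing a triangle; the complex, features, and update rule are illustrative assumptions, not any specific library's API:

```python
from itertools import combinations

# Hypothetical complex: two triangles sharing the edge {1, 2}.
triangles = [frozenset({0, 1, 2}), frozenset({1, 2, 3})]
edges = sorted(
    {frozenset(e) for t in triangles for e in combinations(sorted(t), 2)},
    key=sorted,
)

# Hypothetical scalar features on edges (one per edge, in `edges` order).
h = {e: float(i) for i, e in enumerate(edges)}

def neighbors(edge):
    """Edges upper-adjacent to `edge` through a shared triangle."""
    out = set()
    for t in triangles:
        if edge <= t:
            out.update(frozenset(e) for e in combinations(sorted(t), 2))
    return out - {edge}

def message_passing_step(h):
    """One layer: each edge adds the mean of its neighbors' features."""
    new_h = {}
    for y in h:
        nbrs = neighbors(y)
        agg = sum(h[x] for x in nbrs) / len(nbrs) if nbrs else 0.0
        new_h[y] = h[y] + agg  # update combines old state and aggregate
    return new_h

h = message_passing_step(h)
```

Note that the shared edge {1, 2} aggregates from four neighbors (edges of both triangles), while an edge belonging to a single triangle aggregates from only two; the higher-order structure directly shapes the information flow.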