-dimensional hypercube is a network topology for parallel computers with
The topology allows for an efficient implementation of some basic communication primitives such as Broadcast, All-Reduce, and Prefix sum.
The algorithms described in this page utilize this structure efficiently.
Most of the communication primitives presented in this article share a common template.
The following pseudo code sketches the communication steps necessary.
Hereby, Initialization, Operation, and Output are placeholders that depend on the given communication primitive (see next section).
Each processing element iterates over its neighbors (the expression
's binary representation, therefore obtaining the numbers of its neighbors).
The processing operation depends on the communication primitive.
In the beginning of a prefix sum operation, each processing element
The pseudo code stores the prefix sum in variable
All-gather operations start with each processing element having a message
With each iteration, the transferred message doubles in length.
So it is a Reduce operation, where all processing units know the result.
Compared to a normal reduce operation followed by a broadcast, All-Reduce in hypercubes reduces the number of communication steps.
With each iteration a messages comes closer to its destination by one dimension, if it hasn't arrived yet.
In every following step, the sub cube is only half the size as before, but in the previous step exactly the same number of messages arrived from another processing element.
The ESBT-broadcast (Edge-disjoint Spanning Binomial Tree) algorithm[3] is a pipelined broadcast algorithm with optimal runtime for clusters with hypercube network topology.
edge-disjoint binomial trees in the hypercube, such that each neighbor of processing element
chunks of equal size and cyclically sends them to the roots of the binomial trees.
Upon receiving a chunk, the binomial trees broadcast it.
Broadcasting the chunk within the binomial tree takes
steps until the last binomial tree broadcast has finished, resulting in
This section describes how to construct the binomial trees systematically.
First, construct a single binomial spanning tree von
Then the children of each nodes are obtained by negating single leading zeroes.
This results in a single binomial spanning tree.
edge-disjoint copies of the tree, translate and rotate the nodes: for the
-th copy of the tree, apply a XOR operation with
The resulting binomial trees are edge-disjoint and therefore fulfill the requirements for the ESBT-broadcasting algorithm.