Bitonic sorter

Bitonic mergesort is a parallel algorithm for sorting.

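It is also used as a construction method for building a sorting network.[1]

A network built this way performs a fixed, data-independent sequence of comparisons with a parallel depth of $O(\log^2 n)$ for $n$ inputs. This makes it a popular choice for sorting large numbers of elements on an architecture that itself contains a large number of parallel execution units running in lockstep, such as a typical GPU.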

It is evident from the construction algorithm that, for $n = 2^q$ inputs, the number of rounds of parallel comparisons is given by $q(q+1)/2$, where $q = \log_2 n$.
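As a quick check of this count, the sketch below enumerates the rounds the way an iterative construction typically does: an outer merge size `k` that doubles, and an inner comparison distance `j` that halves. It compares the total with the closed form $\log_2 n\,(\log_2 n + 1)/2$. The function name `count_rounds` is illustrative, and `n` is assumed to be a power of two.

```python
def count_rounds(n: int) -> int:
    """Count the rounds of parallel comparisons for n inputs (n a power of two)."""
    assert n >= 1 and n & (n - 1) == 0, "n must be a power of two"
    rounds = 0
    k = 2
    while k <= n:          # one merge phase per run size 2, 4, ..., n
        j = k // 2
        while j > 0:       # one round per comparison distance k/2, k/4, ..., 1
            rounds += 1
            j //= 2
        k *= 2
    return rounds


for n in (2, 4, 8, 16, 1024):
    q = n.bit_length() - 1              # q = log2(n)
    assert count_rounds(n) == q * (q + 1) // 2

print(count_rounds(16))  # → 10
```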

Although the absolute number of comparisons is typically higher than Batcher's odd-even sort, many of the consecutive operations in a bitonic sort retain a locality of reference, making implementations more cache-friendly and typically more efficient in practice.

The network is designed to sort the elements, with the largest number at the bottom.

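Each red box compares every input in its top half with the corresponding input in its bottom half. If the input to a red box is bitonic, then each half of its output is bitonic, and every element of one half is less than or equal to every element of the other half. This theorem is not obvious, but can be verified by carefully considering all the cases of how the various inputs might compare, using the zero-one principle, where a bitonic sequence is a sequence of 0s and 1s that contains no more than two "10" or "01" subsequences.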
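The case analysis can also be done by brute force. The sketch below assumes a red box that compares element `i` of the top half with element `i` of the bottom half and moves the larger value down. It checks every 0-1 bitonic input of a given even length and confirms that both output halves are bitonic and that no element of the top half exceeds any element of the bottom half. The names `is_bitonic01`, `half_cleaner`, and `check` are illustrative.

```python
from itertools import product


def is_bitonic01(bits):
    """A 0-1 sequence is bitonic if neighbours differ at most twice."""
    changes = sum(1 for a, b in zip(bits, bits[1:]) if a != b)
    return changes <= 2


def half_cleaner(seq):
    """Compare element i of the top half with element i of the bottom half."""
    half = len(seq) // 2
    out = list(seq)
    for i in range(half):
        if out[i] > out[i + half]:
            out[i], out[i + half] = out[i + half], out[i]
    return out


def check(n):
    """Verify the red-box property for all 0-1 bitonic inputs of even length n."""
    half = n // 2
    for bits in product((0, 1), repeat=n):
        if not is_bitonic01(bits):
            continue
        out = half_cleaner(bits)
        top, bottom = out[:half], out[half:]
        assert is_bitonic01(top) and is_bitonic01(bottom)
        assert max(top) <= min(bottom)
    return True


check(8)
```

By the zero-one principle, a comparison network that handles every 0-1 input correctly handles arbitrary inputs as well, which is why checking 0s and 1s is enough here.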

Every blue or green box has the same structure: a red box is applied to the entire input sequence, then to each half of the result, then to each half of each of those results, and so on.

If the input to this box happens to be bitonic, then the output will be completely sorted in increasing order (blue) or decreasing order (green).

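Each number entering the box first passes through a red box that sorts it into the correct half of the list. It will then pass through a smaller red box that sorts it into the correct quarter of the list within that half, and so on until it reaches exactly the correct position.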

Therefore, the output of the green or blue box will be completely sorted.
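The structure of a blue or green box can be sketched recursively. The code below is a minimal illustration, assuming the input is bitonic and its length is a power of two. The function name `bitonic_merge` and the `ascending` flag (True for a blue box, False for a green one) are illustrative choices.

```python
def bitonic_merge(seq, ascending=True):
    """Sort a bitonic sequence whose length is a power of two."""
    n = len(seq)
    if n <= 1:
        return list(seq)
    half = n // 2
    out = list(seq)
    # Red box: compare each top-half element with its bottom-half partner.
    for i in range(half):
        a, b = out[i], out[i + half]
        out_of_order = a > b if ascending else a < b
        if out_of_order:
            out[i], out[i + half] = b, a
    # Apply smaller red boxes to each half of the result.
    return (bitonic_merge(out[:half], ascending)
            + bitonic_merge(out[half:], ascending))


print(bitonic_merge([1, 4, 6, 8, 7, 5, 3, 2]))         # → [1, 2, 3, 4, 5, 6, 7, 8]
print(bitonic_merge([1, 4, 6, 8, 7, 5, 3, 2], False))  # → [8, 7, 6, 5, 4, 3, 2, 1]
```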

The green and blue boxes combine to form the entire sorting network.

For any sequence of inputs, the network will sort them correctly, with the largest at the bottom.

The output of each green or blue box will be a sorted sequence, so concatenating the outputs of each adjacent pair of boxes gives a bitonic sequence, because the top one is blue (increasing) and the bottom one is green (decreasing).

Each column of blue and green boxes takes N sorted sequences and concatenates them in pairs to form N/2 bitonic sequences, which are then sorted by the boxes in that column to form N/2 sorted sequences.

This process starts with each input considered to be a sorted list of one element, and continues through all the columns until the last merges them into a single, sorted list.

Because the last stage is blue, this final list will have the largest element at the bottom.
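The column-by-column process can be sketched as follows. This is an illustration rather than a reference implementation: it assumes the input length is a power of two, represents each wire group as a Python list, and uses illustrative names (`bitonic_merge`, `bitonic_sort`). Within each column, even-numbered boxes are treated as blue (ascending) and odd-numbered boxes as green (descending), so the single box in the last column is blue.

```python
def bitonic_merge(seq, ascending):
    """Blue (ascending) or green (descending) box: sorts a bitonic sequence."""
    n = len(seq)
    if n <= 1:
        return list(seq)
    half = n // 2
    out = list(seq)
    for i in range(half):
        a, b = out[i], out[i + half]
        if (a > b) if ascending else (a < b):
            out[i], out[i + half] = b, a
    return (bitonic_merge(out[:half], ascending)
            + bitonic_merge(out[half:], ascending))


def bitonic_sort(values):
    """Sort a list whose length is a power of two, one column at a time."""
    n = len(values)
    assert n >= 1 and n & (n - 1) == 0, "length must be a power of two"
    # Start with each input as a sorted list of one element.
    runs = [[v] for v in values]
    while len(runs) > 1:
        # Concatenate runs in pairs (each pair is bitonic) and sort each pair,
        # alternating blue and green boxes down the column.
        runs = [
            bitonic_merge(runs[2 * p] + runs[2 * p + 1], ascending=(p % 2 == 0))
            for p in range(len(runs) // 2)
        ]
    return runs[0]


print(bitonic_sort([5, 3, 8, 1, 9, 2, 7, 4]))  # → [1, 2, 3, 4, 5, 7, 8, 9]
```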

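Each green box performs the same operation as a blue box, but sorts in the opposite direction. Each green box could therefore be replaced by a blue box followed by a crossover in which every wire moves to the opposite position. This would allow all the arrows to point the same direction, but would prevent the horizontal lines from being straight.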

However, a similar crossover could be placed to the right of the bottom half of the outputs from any red block, and the sort would still work correctly, because the reverse of a bitonic sequence is still bitonic.

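The orange blocks are equivalent to red blocks in which the sequence order is reversed for the bottom half of the inputs and the bottom half of the outputs. In effect, an orange block compares each input with its mirror image within the block (first with last, second with second-to-last, and so on).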

This is the most common representation of a bitonic sorting network.

Unlike the previous interpretation, because the elements remain logically ordered, it's easy to extend this representation to a non-power-of-two case (where each compare-and-swap ignores any case where the larger index is out of range).
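A minimal sketch of this extension is given below. It assumes the usual index formulation of this representation: in a block of size `k`, an orange stage pairs index `i` with `i XOR (k - 1)` (its mirror image in the block), and each following red stage pairs `i` with `i XOR j` for `j = k/4, k/8, ..., 1`. Every comparator moves the smaller value to the lower index, and any pair whose larger index is out of range is skipped. The function and helper names are illustrative.

```python
def bitonic_sort_any_length(values):
    """Sort a list of any length using same-direction comparators only."""
    arr = list(values)
    n = len(arr)

    def compare_swap(i, l):
        # Handle each pair once (i < l) and ignore out-of-range partners.
        if i < l < n and arr[i] > arr[l]:
            arr[i], arr[l] = arr[l], arr[i]

    k = 2
    while k // 2 < n:                      # block sizes 2, 4, ... until k >= n
        for i in range(n):                 # orange stage: mirror within block
            compare_swap(i, i ^ (k - 1))
        j = k // 4
        while j > 0:                       # red stages: distance k/4, ..., 1
            for i in range(n):
                compare_swap(i, i ^ j)
            j //= 2
        k *= 2
    return arr


print(bitonic_sort_any_length([3, 2, 1]))           # → [1, 2, 3]
print(bitonic_sort_any_length([9, 4, 7, 1, 8, 2]))  # → [1, 2, 4, 7, 8, 9]
```

Skipping out-of-range pairs behaves as if the array were padded at the end with values larger than any real element: because every comparator sends the larger value toward the higher index, such padding would never move.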

The following is a recursion-free implementation of the bitonic mergesort when the array length is a power of two:[2]