k-way merge algorithm

If the arrays are merged in arbitrary order, then the resulting running time is only O(kn).

It is clear that this minimizes the running time and can therefore not be worse than the strategy described in the previous paragraph.

The strategy explained in the previous paragraph needs Θ(n log k) running time, while the improved one only needs Θ(n + k log k) running time.

This straightforward implementation results in a running time of Θ(kn).

A heap uses approximately 2*log(k) comparisons in each step because it handles the tree from the root down to the bottom and needs to compare both children of each node.

The idea is to maintain a min-heap of the k lists, each keyed by their smallest current element.

A simple algorithm builds an output buffer with nodes from the heap.

Repeat until there is only one node left in the heap, at which point just append that remaining list (head and tail) to the output buffer.

Initially these pointers point to the smallest elements of the input array.

In an O(k) preprocessing step the heap is created using the standard heapify procedure.

Afterwards, the algorithm iteratively transfers the element that the root pointer points to, increases this pointer and executes the standard decrease key procedure upon the root element.

The running time of the increase key procedure is bounded by O(log k).

Note that the operation of replacing the key and iteratively doing decrease-key or sift-down are not supported by many Priority Queue libraries such as C++ stl and Java.

The list is sorted in ascending order, so the winner of a game is the smaller one of both elements.

For k-way merging, it is more efficient to only store the loser of each game (see image).

When building the tree or replacing an element with the next one from its list, we still promote the winner of the game to the top.

The tree is filled like in a sports match but the nodes only store the loser.

When using a loser tree, the partner for replaying the games is already stored in the nodes.

The loser of each replayed game is written to the node and the winner is iteratively promoted to the top.

In the following pseudocode, an object oriented tree is used instead of an array because it is easier to understand.

In each step of merging, only the games on the path from the new element to the root need to be replayed.

As the tree is balanced, the path from one of the input arrays to the root contains only Θ(log k) elements.

The games on the way to the top are replayed like in the previous section about replacement selection.

One can show that no comparison-based k-way merge algorithm exists with a running time in O(n f(k)) where f grows asymptotically slower than a logarithm, and n being the total number of elements.

This is a contradiction to the well-known result that no comparison-based sorting algorithm with a worst case running time below O(n log n) exists.

External sorting is required when the data being sorted do not fit into the main memory of a computing device (usually RAM) and instead they must reside in the slower external memory (usually a hard drive).

This reduction of merge passes is especially important considering the large amount of information that is usually being sorted in the first place, allowing for greater speed-ups while also reducing the amount of accesses to slower storage.

Tournament tree
Loser tree
Example for replacement selection
Visualization for the whole algorithm