In computer science, heapsort is an efficient, comparison-based sorting algorithm that reorganizes an input array into a heap (a data structure where each node is greater than or equal to its children) and then repeatedly removes the largest node from that heap, placing it at the end of the array.[3]
Although somewhat slower in practice on most machines than a well-implemented quicksort, it has the advantages of very simple implementation and a more favorable worst-case O(n log n) runtime.
Heapsort was invented by J. W. J. Williams in 1964.[5] In the same year, Robert W. Floyd published an improved version that could sort an array in-place, continuing his earlier research into the treesort algorithm.
The number of iterations in any one siftDown call is bounded by the height of the tree, which is ⌊log₂ n⌋ = O(log n).
Starting with element ⌊n/2⌋ and working backwards, each internal node is made the root of a valid heap by sifting down.[6][7]
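As a concrete illustration, the construction could look like the following C sketch (0-based indexing; the names swap_ints, sift_down and build_max_heap are chosen for this example rather than taken from any particular source):

    #include <stddef.h>

    /* Exchange two elements of an int array. */
    static void swap_ints(int *a, size_t i, size_t j) {
        int t = a[i];
        a[i] = a[j];
        a[j] = t;
    }

    /* Restore the max-heap property for the subtree rooted at `root`,
       assuming both child subtrees are already valid heaps; `end` is
       one past the last element of the heap.  Each level costs two
       comparisons: one to pick the larger child, one against it. */
    static void sift_down(int *a, size_t root, size_t end) {
        for (;;) {
            size_t child = 2 * root + 1;          /* left child */
            if (child >= end)
                break;                            /* root is a leaf */
            if (child + 1 < end && a[child] < a[child + 1])
                child++;                          /* pick the larger child */
            if (a[root] >= a[child])
                break;                            /* heap property holds */
            swap_ints(a, root, child);
            root = child;                         /* descend one level */
        }
    }

    /* Floyd's O(n) heap construction: sift down each internal node,
       starting with the last one (index n/2 - 1 when counting from 0). */
    static void build_max_heap(int *a, size_t n) {
        for (size_t i = n / 2; i-- > 0; )
            sift_down(a, i, n);
    }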
Although it is convenient to think of the two phases separately, most implementations combine the two, allowing a single instance of siftDown to be expanded inline.
The description above uses Floyd's improved heap-construction algorithm, which operates in O(n) time and uses the same siftDown primitive as the heap-extraction phase.
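Reusing the sift_down and build_max_heap sketches above, the combined two-phase algorithm might then read:

    /* Sort `a` ascending: build a max-heap, then repeatedly swap the
       maximum (a[0]) into its final place and shrink the heap by one. */
    void heapsort_ints(int *a, size_t n) {
        build_max_heap(a, n);
        for (size_t end = n; end > 1; end--) {
            swap_ints(a, 0, end - 1);    /* current maximum to the end */
            sift_down(a, 0, end - 1);    /* re-heapify the shorter prefix */
        }
    }

For example, applied to {5, 1, 4, 2, 3}, the construction phase produces the heap {5, 3, 4, 2, 1}, after which the extraction phase moves 5, 4, 3, 2 and 1 into place in turn.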
Williams' original algorithm instead builds the heap incrementally: each element in turn is appended to the end of a growing heap and sifted up into place. Being at the end of the array, the new element is a leaf and has no children to worry about, but may violate the heap property by being greater than its parent.
Whereas Floyd's method makes most of its siftDown calls on small subheaps near the leaves, in Williams' algorithm most of the calls to siftUp are made on large heaps of height O(log n), which is why the construction phase alone takes O(n log n) time in the worst case.
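For contrast, a Williams-style construction could be sketched with a sift_up helper (again illustrative names, in the same style as the code above):

    /* Move the element at index `child` up toward the root until its
       parent is at least as large: one comparison per level climbed. */
    static void sift_up(int *a, size_t child) {
        while (child > 0) {
            size_t parent = (child - 1) / 2;
            if (a[parent] >= a[child])
                break;                    /* heap property already holds */
            swap_ints(a, parent, child);
            child = parent;
        }
    }

    /* Williams-style O(n log n) construction: grow the heap one element
       at a time, sifting each newly appended leaf up into place. */
    static void build_max_heap_williams(int *a, size_t n) {
        for (size_t i = 1; i < n; i++)
            sift_up(a, i);                /* a[0..i] is now a valid heap */
    }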
Because it is dominated by the second, heap-extraction phase, the heapsort algorithm itself has O(n log n) time complexity using either version of heapify.[11]
If comparisons are cheap (e.g. integer keys) then the difference in the number of comparisons is unimportant,[12] as top-down heapsort compares values that have already been loaded from memory.[14]
In top-down heapsort, each step of siftDown requires two comparisons, to find the maximum of three elements: the new node and its two children.
Bottom-up heapsort conceptually replaces the root with a value of −∞ and sifts it down using only one comparison per level (since no child can possibly be less than −∞) until the leaves are reached, then replaces the −∞ with the correct value and sifts it up (again, using one comparison per level) until the correct position is found.[11]
A naïve implementation of this conceptual algorithm would cause some redundant data copying, as the sift-up portion undoes part of the sifting down.
A practical implementation therefore searches downward for the leaf where −∞ would land, then upward for the sifted value's correct position; finally, the upward traversal continues to the root's starting position, performing no more comparisons but exchanging nodes to complete the necessary rearrangement.[15]
The downward search can be factored into a helper routine, leafSearch, whose return value is used in the modified siftDown routine, as sketched below.[11]
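One possible rendering of that pair, in the same illustrative C style as above (published versions differ in details such as the handling of a node with a single child):

    /* Follow the path of larger children from `root` down to a leaf,
       as if sifting down minus infinity: one comparison per level. */
    static size_t leaf_search(const int *a, size_t root, size_t end) {
        size_t j = root;
        while (2 * j + 2 < end)                   /* two children exist */
            j = (a[2 * j + 1] < a[2 * j + 2]) ? 2 * j + 2 : 2 * j + 1;
        if (2 * j + 1 < end)                      /* lone left child */
            j = 2 * j + 1;
        return j;
    }

    /* Bottom-up siftDown: descend to the leaf found by leaf_search,
       climb back up to the first ancestor not smaller than the sifted
       value (one comparison per level), then shift the values on that
       path up one position each and drop the sifted value into place. */
    static void sift_down_bottom_up(int *a, size_t root, size_t end) {
        size_t j = leaf_search(a, root, end);
        int x = a[root];
        while (a[j] < x)             /* stops at root at the latest */
            j = (j - 1) / 2;
        int y = a[j];                /* value displaced by x */
        a[j] = x;
        while (j > root) {           /* shift the rest of the path up */
            j = (j - 1) / 2;
            int t = a[j];
            a[j] = y;
            y = t;
        }
    }

Substituting sift_down_bottom_up for sift_down in the extraction loop sketched earlier yields the bottom-up variant.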
Bottom-up heapsort was announced as beating quicksort (with median-of-three pivot selection) on arrays of size ≥ 16000.[17]
Heapsort primarily competes with quicksort, another very efficient general-purpose, in-place, unstable comparison-based sorting algorithm.
Heapsort's primary advantages are its simple, non-recursive code, minimal auxiliary storage requirement, and reliably good performance: its best and worst cases are within a small constant factor of each other, and of the theoretical lower bound on comparison sorts.
Heapsort's primary disadvantages are its poor locality of reference and its inherently serial nature; the accesses to the implicit tree are widely scattered and mostly random, and there is no straightforward way to convert it to a parallel algorithm.
The worst-case performance guarantees make heapsort popular in real-time computing and in systems concerned with maliciously chosen inputs,[30] such as the Linux kernel.
With additional effort, quicksort can also be implemented in mostly branch-free code,[33][34] and multiple CPUs can be used to sort subpartitions in parallel.