Flashsort

In particular, the goal is for the buckets to be of approximately equal size (n/m elements each),[1] with the ideal being division into m quantiles.

While the basic algorithm is a linear interpolation sort, if the input distribution is known to be non-uniform, a non-linear division will more closely approximate this ideal.

What distinguishes flash sort is step 5: an efficient O(n) in-place algorithm for collecting the elements of each bucket together in the correct relative order using only m words of additional memory.

The clever part is the elimination of one of those variables, allowing twice as many buckets to be used and therefore half as much time spent on the final O(n2) sorting.

Although the preceding description uses K to find the cycle leaders, it is in fact possible to do without it, allowing the entire m-word array to be eliminated.

(After the distribution is complete, the bucket boundaries can be found in L.) Suppose that we have classified all elements up to i−1, and are considering Ai as a potential new cycle leader.

Then proceed:[1][2] While saving memory, Flashsort has the disadvantage that it recomputes the bucket for many already-classified elements.

A small further reduction can be achieved by maintaining a count of unclassified elements and stopping when it reaches zero.

The only extra memory requirements are the auxiliary vector L for storing bucket bounds and the constant number of other variables used.

In the worst-case scenarios where almost all the elements are in a few buckets, the complexity of the algorithm is limited by the performance of the final bucket-sorting method, so degrades to O(n2).

Variations of the algorithm improve worst-case performance by using better-performing sorts such as quicksort or recursive flashsort on buckets which exceed a certain size limit.