Merge-insertion sort

In computer science, merge-insertion sort or the Ford–Johnson algorithm is a comparison sorting algorithm published in 1959 by L. R. Ford Jr. and Selmer M.

[1][2][3][4] It uses fewer comparisons in the worst case than the best previously known algorithms, binary insertion sort and merge sort,[1] and for 20 years it was the sorting algorithm with the fewest known comparisons.

[5] Although not of practical significance, it remains of theoretical interest in connection with the problem of sorting with a minimum number of comparisons.

[3] The same algorithm may have also been independently discovered by Stanisław Trybuła and Czen Ping.

[4] Merge-insertion sort performs the following steps, on an input

elements:[6] The algorithm is designed to take advantage of the fact that the binary searches used to insert elements into

are most efficient (from the point of view of worst case analysis) when the length of the subsequence that is searched is one less than a power of two.

This is because, for those lengths, all outcomes of the search use the same number of comparisons as each other.

[1] To choose an insertion ordering that produces these lengths, consider the sorted sequence

after step 4 of the outline above (before inserting the remaining elements), and let

is odd, the remaining unpaired element should also be numbered as

larger than the indexes of the paired elements.

denote the number of comparisons that merge-insertion sort makes, in the worst case, when sorting

This number of comparisons can be broken down as the sum of three terms: In the third term, the worst-case number of comparisons for the elements in the first group is two, because each is inserted into a subsequence of

is inserted into the three-element subsequence

is inserted into some permutation of the three-element subsequence

, or in some cases into the two-element subsequence

of the second group are each inserted into a subsequence of length at most seven, using three comparisons.

More generally, the worst-case number of comparisons for the elements in the

[1][2][3][4] By summing the number of comparisons used for all the elements and solving the resulting recurrence relation, this analysis can be used to compute the values of

, giving the formula[7] or, in closed form,[8] For

the numbers of comparisons are[1] The algorithm is called merge-insertion sort because the initial comparisons that it performs before its recursive call (pairing up arbitrary items and comparing each pair) are the same as the initial comparisons of merge sort, while the comparisons that it performs after the recursive call (using binary search to insert elements one by one into a sorted list) follow the same principle as insertion sort.

In this sense, it is a hybrid algorithm that combines both merge sort and insertion sort.

However, for larger inputs the number of comparisons made by the merge-insertion algorithm is bigger than this lower bound.

Merge-insertion sort also performs fewer comparisons than the sorting numbers, which count the comparisons made by binary insertion sort or merge sort in the worst case.

, with the same leading term but a worse constant factor in the lower-order linear term.

[10][11] For 20 years, merge-insertion sort was the sorting algorithm with the fewest comparisons known for all input lengths.

However, in 1979 Glenn Manacher published another sorting algorithm that used even fewer comparisons, for large enough inputs.

[3][5] It remains unknown exactly how many comparisons are needed for sorting, for all

, but Manacher's algorithm and later record-breaking sorting algorithms have all used modifications of the merge-insertion sort ideas.

An animation of the merge-algorithm sorting an array of randomized values.