In applications such as in radix sort, a bound on the maximum key value k will be known in advance, and can be assumed to be part of the input to the algorithm.
[1][2][3] The relative order of items with equal keys is preserved here; i.e., this is a stable sort.
Because the algorithm uses only simple for loops, without recursion or subroutine calls, it is straightforward to analyze.
[1] For problem instances in which the maximum key value is significantly smaller than the number of items, counting sort can be highly space-efficient, as the only storage it uses other than its input and output arrays is the Count array which uses space O(k).
[5] If each item to be sorted is itself an integer, and used as key as well, then the second and third loops of counting sort can be combined; in the second loop, instead of computing the position where items with key i should be placed in the output, simply append Count[i] copies of the number i to the output.
Thus the keys are sorted and the duplicates are eliminated in this variant just by being placed into the bit array.
For data in which the maximum key size is significantly smaller than the number of data items, counting sort may be parallelized by splitting the input into subarrays of approximately equal size, processing each subarray in parallel to generate a separate count array for each subarray, and then merging the count arrays.
It is possible to modify the algorithm so that it places the items into sorted order within the same array that was given to it as the input, using only the count array as auxiliary storage; however, the modified in-place version of counting sort is not stable.