Linear Time Sorting – Bucket or Bin Sort
Assume that the keys of the items that we wish to sort lie in a small fixed range and that there is only one item with each value of the key. Then we can sort with the following procedure:
1. Set up an array of “bins” – one for each value of the key – in order,
2. Examine each item and use the value of the key to place it in the appropriate bin.
Now our collection is sorted and it only took n operations, so this is an O(n) operation. However, note that it will only work under very restricted conditions. To understand these restrictions, let’s be a little more precise about the specification of the problem and assume that there are m values of the key. To recover our sorted collection, we need to examine each bin. This adds a third step to the algorithm above,
3. Examine each bin to see whether there’s an item in it.
which requires m operations. So the algorithm’s time becomes:
T (n) = c1n + c2m
and it is strictly O(n + m). If m ≤ n, this is clearly O(n). However if m >> n, then it is O(m). An implementation of bin sort might look like:
BUCEKTSORT( array A, int n, int M)
1 // Pre-condition: for 1 ≤ i ≤ n, 0 ≤ a[i] < M
2 // Mark all the bins empty
3 for i ← 1 to M
4 do bin[i] ← Empty
5 for i ← 1 to n
6 do bin[A[i]] ← A[i]
If there are duplicates, then each bin can be replaced by a linked list. The third step then becomes:
3. Link all the lists into one list.
We can add an item to a linked list in O(1) time. There are n items requiring O(n) time. Linking a list to another list simply involves making the tail of one list point to the other, so it is O(1). Linking m such lists obviously takes O(m) time, so the algorithm is still O(n + m).