9.3 Case study: Heapsort
We can use a heap to implement a very simple sorting algorithm:
- Insert all elements from the unsorted array into a min-heap.
- Remove each element in turn from the heap, putting it in its right place in the original array.
Since the heap returns the smallest elements first, they will be written back into the array in sorted order. Here is pseudocode:
function naiveHeapsort(arr):
    heap = new MinHeap()
    for i in 0 .. arr.size-1:
        heap.add(arr[i])
    for i in 0 .. arr.size-1:
        arr[i] = heap.removeMin()
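If you want to run this idea yourself, here is a minimal sketch in Python, using the standard heapq module as the min-heap (the pseudocode's MinHeap is assumed to provide the same add/removeMin operations):

```python
import heapq

def naive_heapsort(arr):
    # Insert all elements into a min-heap (heapq implements a binary min-heap).
    heap = []
    for x in arr:
        heapq.heappush(heap, x)       # O(log n) per insertion
    # Remove the elements again; they come out in ascending order.
    for i in range(len(arr)):
        arr[i] = heapq.heappop(heap)  # O(log n) per removal

data = [5, 3, 8, 1, 9, 2]
naive_heapsort(data)
print(data)  # [1, 2, 3, 5, 8, 9]
```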
What is the time complexity of this algorithm?
- We have two very similar for loops, iterating over the range of the array.
- The body of each loop is logarithmic, O(log n), so the total complexity of each loop is O(n log n).
So, we have two loops with linearithmic complexity, and therefore the algorithm is linearithmic too, O(n log n).
This means that naiveHeapsort is an efficient sorting algorithm. However, our algorithm has some disadvantages:
- It is not in-place: we have to allocate additional space for the heap.
- Also, we saw in the previous section that we can build the heap both faster and in-place.
- (That’s why we call it naiveHeapsort, and now we will find out how to solve both of these problems.)
9.3.1 In-place Heapsort
In Section 9.2.7 we saw that there is a more efficient way of turning an array into a heap, by using the buildHeap method. We can use this to implement a faster and in-place version of Heapsort.
The crucial step here is to use a max-heap, which might seem counter-intuitive. After building the heap we tweak the removeMax method a little, so that it keeps the removed element, but puts it at the end of the array. This is why we need a max-heap – because the first element we remove will be put at the very end of the array. Here is an overview of the idea:
- First, turn the array into a max-heap, using buildHeap.
- Remove the maximum element and put it at the end of the array. Then decrease the size of the heap, so that the maximum element lies outside of the heap.
- Now remove the second largest element, put it in the second-to-last position, and decrease the size of the heap again.
- Then remove the third largest element, and so on.
- Finally, when the heap is exhausted, the array is sorted. (A small worked example follows after this list.)
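For example, sorting the array [3, 1, 4, 2] could proceed as follows. The exact intermediate states depend on how the heap implementation breaks ties, but a typical run looks like this (elements after the bar lie outside the heap):

- buildHeap turns [3, 1, 4, 2] into the max-heap [4, 2, 3, 1].
- Swap the maximum 4 with the last element and shrink the heap: [1, 2, 3 | 4]; sifting the new root down gives [3, 2, 1 | 4].
- Swap 3 into the second-to-last position: [1, 2 | 3, 4]; sifting down gives [2, 1 | 3, 4].
- Swap 2 into place: [1 | 2, 3, 4]. The single remaining element is already in place, so the array is sorted: [1, 2, 3, 4].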
Here is a visualisation of the Heapsort algorithm.
A complete implementation is as follows.
function heapsort(arr):
    // First, convert the array to a max-heap.
    heap = new MaxHeap()
    heap.buildHeap(arr)
    // Then, repeatedly remove the maximum element from the heap,
    // and put it just after the heap.
    n = heap.size
    for size in n-1, n-2 .. 0:
        heap.swap(0, size)  // Put the max element at the end of the heap.
        heap.size = size    // Change the heap size so it excludes the max element.
        heap.siftDown(0)    // Now, sift the temporary root down.
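As a concrete, runnable counterpart to the pseudocode, here is a sketch of in-place Heapsort in Python. The helper names sift_down and build_heap are our own and simply mirror the siftDown and buildHeap operations used above; the loop stops at heap size 1, since a one-element heap is already sorted:

```python
def sift_down(arr, i, size):
    # Move arr[i] down until the max-heap property holds within arr[0:size].
    while True:
        left, right = 2 * i + 1, 2 * i + 2
        largest = i
        if left < size and arr[left] > arr[largest]:
            largest = left
        if right < size and arr[right] > arr[largest]:
            largest = right
        if largest == i:
            return
        arr[i], arr[largest] = arr[largest], arr[i]
        i = largest

def build_heap(arr):
    # Sift down all internal nodes, from the last one up to the root: O(n).
    for i in range(len(arr) // 2 - 1, -1, -1):
        sift_down(arr, i, len(arr))

def heapsort(arr):
    build_heap(arr)
    for size in range(len(arr) - 1, 0, -1):
        arr[0], arr[size] = arr[size], arr[0]  # Move the max just after the heap.
        sift_down(arr, 0, size)                # Restore the heap in arr[0:size].

data = [5, 3, 8, 1, 9, 2]
heapsort(data)
print(data)  # [1, 2, 3, 5, 8, 9]
```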
Here is a warmup practice exercise for Heapsort.
Heapsort proficiency practice
Now test yourself to see how well you understand Heapsort. Can you reproduce its behaviour?
Analysis of in-place Heapsort
Here is an analysis of the time complexity of Heapsort:
- The first step in Heapsort is to heapify the array. This costs O(n) running time for an array of size n.
- Then it repeatedly swaps the root node with the last element of the heap and “sifts” the new root down the heap.
- Sifting down is logarithmic, O(log n), in the worst case, since the height of the tree is logarithmic.
- And this is done for each of the n elements in the array.
- So, the second loop is linearithmic, O(n log n).
Therefore the worst-case complexity of Heapsort is linearithmic, O(n) + O(n log n) = O(n log n).
This visualisation explains the running time analysis of Heapsort.
While typically slower than Quicksort by a constant factor (because unloading the heap using removeMax is somewhat slower than Quicksort’s series of partitions), Heapsort has one special advantage over the other sorts studied so far. Building the heap is relatively cheap, requiring O(n) time. Removing the maximum-valued record from the heap requires O(log n) time in the worst case. Thus, if we wish to find the k records with the largest key values in an array, we can do so in time O(n + k log n). If k is small, this is a substantial improvement over the time required to find the k largest-valued records using one of the other sorting methods described earlier (many of which would require sorting all of the array first).
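As a rough sketch of this idea in Python: heapq.heapify builds a heap in O(n), and each heappop costs O(log n), so popping k times yields the k largest elements in O(n + k log n). Since heapq is a min-heap, the sketch below negates the keys to simulate a max-heap (the standard library's heapq.nlargest offers a similar facility with a different strategy):

```python
import heapq

def k_largest(arr, k):
    # Build a max-heap in O(n) by negating the keys (heapq is a min-heap),
    # then pop the k largest elements in O(k log n).
    heap = [-x for x in arr]
    heapq.heapify(heap)
    return [-heapq.heappop(heap) for _ in range(min(k, len(heap)))]

print(k_largest([5, 3, 8, 1, 9, 2], 3))  # [9, 8, 5]
```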
One situation where we are able to take advantage of this concept is in
the implementation of Kruskal’s
algorithm for minimum
spanning trees. That algorithm requires that edges be visited in
ascending order (so, use a min-heap), but this process stops as soon as
the MST is complete. Thus, only a relatively small fraction of the edges
need be sorted.
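To make the connection concrete, here is a sketch of Kruskal’s algorithm in Python that uses heapq as the edge min-heap and stops as soon as the MST is complete. The graph representation (a list of (weight, u, v) tuples) and the small union-find helper are our own assumptions, not part of the text above:

```python
import heapq

def kruskal_mst(num_vertices, edges):
    # edges: list of (weight, u, v) tuples with 0 <= u, v < num_vertices.
    parent = list(range(num_vertices))

    def find(x):
        # Find the representative of x's component (with path compression).
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x

    heap = list(edges)
    heapq.heapify(heap)  # O(E): build a min-heap of all edges.
    mst = []
    while heap and len(mst) < num_vertices - 1:
        w, u, v = heapq.heappop(heap)  # O(log E) per edge, in ascending order.
        ru, rv = find(u), find(v)
        if ru != rv:                   # Skip edges that would create a cycle.
            parent[ru] = rv
            mst.append((u, v, w))
    return mst

edges = [(4, 0, 1), (1, 1, 2), (3, 0, 2), (2, 2, 3), (5, 1, 3)]
print(kruskal_mst(4, edges))  # [(1, 2, 1), (2, 3, 2), (0, 2, 3)]
```

Because the loop stops once num_vertices - 1 edges have been accepted, only the cheapest edges are ever popped from the heap, which is exactly the partial-sorting advantage described above.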
Another special case arises when all of the records being sorted have the same key value. This represents the best case for Heapsort, because removing the maximum value requires only constant time: the value swapped to the root is never sifted down the heap. As a result, Heapsort runs in O(n) time in this case.