Given an integer array, sort it using the heapsort algorithm in C, C++, Java, and Python.

Heapsort Overview

Heapsort is an in-place, comparison-based sorting algorithm and can be thought of as an improved selection sort as it divides the input into a sorted and an unsorted region. It iteratively shrinks the unsorted region by extracting the largest/smallest element and moving that to the sorted region. The improvement consists of using a heap data structure rather than a linear-time search to find the maximum. Heapsort does not produce a stable sort, which means that the implementation does not preserve the input order of equal elements in the sorted output. It is generally slower than other O(n.log(n)) sorting algorithms like quicksort, merge sort.

The heapsort algorithm can be divided into two parts:

  • In the first step, a heap is built out of the input data. We can do this in O(n) time.
  • In the second step, a sorted array is created by repeatedly removing the largest/smallest element from the heap (the root of the heap) and inserting it into the array. The heap is updated (heapify-down is called on the root node) after each removal to maintain the heap property. Once all elements have been removed from the heap, we get a sorted array. This is done in O(n.log(n)) time since we need to pop n elements, where each removal involves a constant amount of work and a single heapify operation takes O(log(n)) time.

How to build a heap?

A naive solution would be to start with an empty heap and repeatedly insert each element of the input list into it. The problem with this approach is that it runs in O(n.log(n)) time on an input containing n elements, as it performs n insertions at O(log(n)) cost each.

We can build a heap in O(n) time by arbitrarily putting the input elements into a heap array. Then starting from the last internal node of the heap (present at index (n-2)/2 in the array), call the heapify procedure on each node up to the root node (till index 0). As the heapify procedure expects a node’s left and right child to be heaps, start from the last internal node (as leaf nodes are already heaps) and move up level by level.

One can argue that the basic heap operation of heapify runs in O(log(n)) time, and we call heapify roughly n/2 times (one for each internal node). So, the time complexity of the above solution is O(n.log(n)). However, it turns out that the analysis is not tight.

When heapify is called, the running time depends on how far an element might move down in the tree before the process terminates. In other words, it depends on the height of the element in a heap. In the worst-case, the element might go down to the leaf level. Let’s count the work done level by level.

  • There are 2h nodes at the bottom-most level, but we do not call heapify on any of these, so the work is 0.
  • There are 2(h-1) nodes at the next level, and each might move down by 1 level.
  • At the 3rd level from the bottom, there are 2(h-2) nodes, and each might move down by 2 levels and do on…

As we can see, not all heapify operations are O(log(n)), and this is why, by doing tight analysis, we might end up getting O(n) time.

Practice this algorithm

1. In-place Heapsort Implementation

Heapsort can be performed in place. We can do this by

  • Using a max-heap instead of min-heap (to sort in ascending order),
  • Using an input array for constructing heap (instead of using heap own storage)

The idea is to split the array into two parts – the heap and the sorted array. As each pop operation free space to the end of the array in a binary heap, move the popped item to the free space. So, the first popped item (maximum element) will go to the last position in the array, the second popped item (next maximum element) will go to the second last position in the array, and so on… finally, when all items are popped from the heap, we will get an array sorted in ascending order.

C


Download  Run Code

C++


Download  Run Code

Output:

-2 1 4 6 7 9

Java


Download  Run Code

Output:

[-2, 1, 4, 6, 7, 9]

Python


Download  Run Code

Output:

[-2, 1, 4, 6, 7, 9]

The time complexity of the above algorithm is O(n.log(n)), where n is the input size and requires O(n) implicit space for the call stack.

2. Out-of-place Heapsort Implementation

The out-of-place heapsort algorithm can be implemented as follows in C++ and Java:

C++


Download  Run Code

Output:

-2 1 4 6 7 9

Java


Download  Run Code

Output:

[-2, 1, 4, 6, 7, 9]

The time complexity of the above algorithm is O(n.log(n)), and the auxiliary space used by the program is O(n), where n is the size of the input.

 
Exercise: Sort elements in descending order using heapsort

 
References:

1. https://en.wikipedia.org/wiki/Heapsort

2. https://stackoverflow.com/questions/9755721/how-can-building-a-heap-be-on-time-complexity