Heap Sort in Excruciating Detail
5 Nov 2017
Data structures and algorithms get interesting when it comes down to implementing them. All the edge cases and important details that seem to get glossed over in the texts come to life. I decided to document the heap sort in excruciating detail, as I was implementing it in Go, so that I would have a record of the gory details. Perhaps others might find this useful too.
This is a maximum binary heap:
__11__ / \ 5 10 / \ / 3 4 8
Any node must be larger than its children.
Because of this, the root of the tree contains the largest node.
A binary tree can be represented in an array, as long as we don't use the first (0th) place in the array:
index 0 1 2 3 4 5 6 value 0 11 5 10 3 4 8 __11(1)__ / \ 5(2) 10(3) / \ / 3(4) 4(5) 8(6)
A parent's left child can be found by multiplying the parent's array index by two:
left_child_index = parent_index * 2
A parent's right child can be found by multiplying the parent's array index by two and adding one:
right_child_index = parent_index * 2 + 1
If a parent's child indexes past the last index of the array, that parent has no children.
Note that node 10 has only a left child. Attempting to access the right child would read outside the array, which C will let you do, and which most other languages will give you some sort of index violation error.
A child's parent can be found by dividing the child's array index by two. This implies integer division, so that, for instance, node 4, at index 5, does not calculate its parent to be at index index 2.5, but correctly at index 2.
parent_index = child_index / 2
If we need to insert a new value into the heap, we start off by doing something interesting: we put the new value one index past the last-used index in our backing array! (WHY we do this will be explained after we answer interesting questions about HOW we do this.)
index 0 1 2 3 4 5 6 7 value 0 11 5 10 3 4 8 NEW
In terms of our binary tree, this means we put the new value in the next available child node. That new node will be the lowest, left-most, valid place in the binary tree.
__11(1)__ / \ 5(2) 10(3) / \ / \ 3(4) 4(5) 8(6) NEW(7)
This means that if you are going to implement a max binary heap using an array, you'd better leave some room for growth! After all, reallocating memory for a new member is an O(N) operation, and nobody wants that. Also, if you are doing to allocate up front a larger array than you need (a good idea) you will need to keep track of how many items there are in your heap / the last used index of your array. This will make life easier.
Now for the why! If we put our new value at the first available new slot in the array (or bottom-most, left-most node, thinking of the binary tree representaton), it's as convenient a place as any to put that new value... to begin with.
Next, we swap the new value with its parent if the new value is larger than its parent. We do this again if the value is still larger than its new parent, and we repeat the process until the new value is in its correctplace.
It only takes O(log N) operations to find the new value's proper place in the heap. Not bad!
The core algorithm is fairly straightforward and looks like this (note that parent and child variables names should be thought of as parent INDEX and child INDEX, not the nodes themselves):
for parent := child / 2; parent > 0; parent = child / 2 { if h.data[child] < h.data[parent] { h.data[child], h.data[parent] = h.data[parent], h.data[child] } child = parent }
Note that we are assuming this type exists, and an instance of it is named h in the above code:
type MaxHeap struct { data []int capacity int size int }
HOWEVER in the real world, there is some code that needs to come before this: does the array need to grow before this new element is added? Presumably we are tracking the last valid index of the array; don't forget to increase that index by one!
A fuller implementation looks more like this (assume h is an instance of the MaxHeap struct from above):
func (h *MaxHeap) Insert(i int) error { if h.size+1 > h.capacity { newCapacity := h.capacity * 2 // if newCapacity became negative, we have exceeded // our capacity by doing one bit-shift too far if newCapacity < 0 { return errors.New("Capacity exceeded") } h.resize(newCapacity) } h.size++ h.data[h.size] = i child := h.size for parent := child / 2; parent > 0; parent = child / 2 { if h.data[child] > h.data[parent] { h.data[child], h.data[parent] = h.data[parent], h.data[child] } child = parent } return nil }
If we add the new value 12 to our max binary heap, here's how the swaps look, step by step:
__11(1)__ / \ 5(2) 10(3) / \ / \ 3(4) 4(5) 8(6) 12(7) index 0 1 2 3 4 5 6 7 value 0 11 5 10 3 4 8 12 __11(1)__ / \ 5(2) 12(3) / \ / \ 3(4) 4(5) 8(6) 10(7) index 0 1 2 3 4 5 6 7 value 0 11 5 12 3 4 8 10 __12(1)__ / \ 5(2) 11(3) / \ / \ 3(4) 4(5) 8(6) 10(7) index 0 1 2 3 4 5 6 7 value 0 12 5 11 3 4 8 10
When we want to remove the maximum value from our heap, the pattern is sort of similar and sort of the opposite of insert. We copy to a variable the top-most value to be returned to the user, and then we take the item at the end of the array (or, bottom-most, left-most node of the tree) and move it into the newly-vacated spot (root of the tree)!
Here's the illustration:
__12(1)__ / \ 5(2) 11(3) / \ / \ 3(4) 4(5) 8(6) 10(7) index 0 1 2 3 4 5 6 7 value 0 12 5 11 3 4 8 10 __12(1)__ --> 12 copied! / \ 5(2) 11(3) / \ / \ 3(4) 4(5) 8(6) 10(7) index 0 1 2 3 4 5 6 7 value 0 12 5 11 3 4 8 10
Last / bottom-most, left-most value is put at the root
__10(1)__ / \ 5(2) 11(3) / \ / \ 3(4) 4(5) 8(6) 10(7) index 0 1 2 3 4 5 6 7 value 0 10 5 11 3 4 8 10
Last valid index is decremented by one.
__10(1)__ / \ 5(2) 11(3) / \ / 3(4) 4(5) 8(6) index 0 1 2 3 4 5 6 value 0 10 5 11 3 4 8
Largest of root's two children is found, and if root is larger than the largest child, root is swapped with that child.
__11(1)__ / \ 5(2) 10(3) / \ / 3(4) 4(5) 8(6) index 0 1 2 3 4 5 6 value 0 10 5 11 3 4 8
Children of the node we just swapped down to are looked at; find largest child, swap if largest child is larger than parent; repeat until there are no children to look at.
The core idea is pretty simple, but the code gets a tad more complicated ensuring we do not read past the end of the array (particularly looking at a right child that might not exist) and of course being sure we decrement the last valid index of the array (here, called h.size). Note too that in this code, parent and child should be read as parentIndex and childIndex (still assuming an instance of MaxHeap named h, from above):
func (h *MaxHeap) Delete() (int, error) { if h.size == 0 { return 0, errors.New("Heap empty") } max := h.data[1] h.data[1] = h.data[h.size] h.size-- parent := 1 for parent*2 <= h.size { child := parent * 2 if child+1 <= h.size && h.data[child+1] > h.data[child] { child++ } if h.data[parent] < h.data[child] { h.data[child], h.data[parent] = h.data[parent], h.data[child] } else { break } parent = child } return max, nil }
Now that we can construct heaps, a natural next step is to provide a heap sort. The naive way to do this would be to create a heap, allocate a new array, and read the items out of the heap in order, populating the new array as we go. Nice and simple, and it requires O(2N) storage.
However, there is a way we can do this in O(N) storage if we don't mind sorting the data index of our MaxHeap instance above.
The first step is to take our Delete method and factor out the bubble-down or sink portion into its own method:
func (h *MaxHeap) Delete() (int, error) { if h.size == 0 { return 0, errors.New("Heap empty") } max := h.data[1] h.data[1] = h.data[h.size] h.size-- parent := 1 sink(h.data, parent, h.size) return max, nil } func sink(data []int, parent int, size int) { for parent*2 <= size { child := parent * 2 if child+1 <= size && data[child+1] > data[child] { child++ } if data[parent] < data[child] { data[child], data[parent] = data[parent], data[child] } else { break } parent = child } }
Then what we can do is use our new sink method on just parts of a larger array by beginning at a parent index and limiting ourselves to a certain size.
Here is our Sort function and its calls to our sink function, above.
func Sort(data []int) { if data == nil || len(data) <= 2 { return } size := len(data) - 1 for i := size / 2; i >= 1; i-- { sink(data, i, size) } for size > 1 { data[1], data[size] = data[size], data[1] size-- sink(data, 1, size) } }
The first for loop takes our array of (presumably) random numbers and turns them into a heap. The interesting thing is that unlike our heap Insert method, which puts each new item at the end of the array, and then percolates that value up to its correct position, this for loop takes values in the array and percolates them down to the bottom of the heap.
The for loop only has to start in the middle of the array, because if we are going to percolate values down, those values need to have children. And values more than half-way into the array (when thinking about the array as a binary tree) have no children.
So for our heap of size 10, we start at the 5th index and percolate that value down to where it belongs. Then we look at the 4th index and do the same thing, then the 3rd, etc.
Here it is in excruciating detail.
NOTE than when reading through this, the variable "size" is a bit unfortunately named; it would ideally be called "lastValidIndex".
sink first call, first loop; parent == 5, size == 10
______32(1)______ / \ ___16(2)___ 20(3) / \ / \ 26(4) 50(5) 19(6) 42(7) / \ / 11(8) 47(9) 29(10) index 0 1 2 3 4 5 6 7 8 9 10 value 0 32 16 20 26 50 19 42 11 47 29
50 has only one child, 29, so 29 is chosen as the largest child.
50 is greater than 29. No need to swap. Break.
50 and 29 are now a heap of size 2.
sink second call, first loop; parent == 4, size == 10
______32(1)______ / \ ___16(2)___ 20(3) / \ / \ 26(4) 50(5) 19(6) 42(7) / \ / 11(8) 47(9) 29(10) index 0 1 2 3 4 5 6 7 8 9 10 value 0 32 16 20 26 50 19 42 11 47 29
26 has two children, 11 and 47. 47 is picked as the largest child.
26 is less than 47, so swap.
______32(1)______ / \ ___16(2)___ 20(3) / \ / \ 47(4) 50(5) 19(6) 42(7) / \ / 11(8) 26(9) 29(10) index 0 1 2 3 4 5 6 7 8 9 10 value 0 32 16 20 47 50 19 42 11 26 29
sink second call, second loop; parent == 9, size == 10
parent * 2 (18) is not less than or equal to size; exit loop.47, 11, and 26 are now a heap of size 3. 50 and 29 are still a heap of size 2.
______32(1)______ / \ ___16(2)___ 20(3) / \ / \ 47(4) 50(5) 19(6) 42(7) / \ / 11(8) 26(9) 29(10) index 0 1 2 3 4 5 6 7 8 9 10 value 0 32 16 20 47 50 19 42 11 26 29
sink third call, first loop; parent == 3, size == 10
20 has two children, 19 and 42. 42 is picked as the largest child.
______32(1)______ / \ ___16(2)___ 20(3) / \ / \ 47(4) 50(5) 19(6) 42(7) / \ / 11(8) 26(9) 29(10) index 0 1 2 3 4 5 6 7 8 9 10 value 0 32 16 20 47 50 19 42 11 26 29
20 is less than 42, so swap.
______32(1)______ / \ ___16(2)___ 42(3) / \ / \ 47(4) 50(5) 19(6) 20(7) / \ / 11(8) 26(9) 29(10) index 0 1 2 3 4 5 6 7 8 9 10 value 0 32 16 42 47 50 19 20 11 26 29
sink third call, second loop; parent == 7, size == 10
parent * 2 (14) is not less than or equal to size; exit loop.
42, 19, and 20 are now a heap of size 3.
47, 11, and 26 are still a heap of size 3.
50 and 29 are still a heap of size 2.
sink fourth call, first loop; parent == 2, size == 10
16 has two children, 47 and 50. 50 is picked as the largest child.
______32(1)______ / \ ___16(2)___ 42(3) / \ / \ 47(4) 50(5) 19(6) 20(7) / \ / 11(8) 26(9) 29(10) index 0 1 2 3 4 5 6 7 8 9 10 value 0 32 16 42 47 50 19 20 11 26 29
16 is less than 50, so swap.
______32(1)______ / \ ___50(2)___ 42(3) / \ / \ 47(4) 16(5) 19(6) 20(7) / \ / 11(8) 26(9) 29(10) index 0 1 2 3 4 5 6 7 8 9 10 value 0 32 50 42 47 16 19 20 11 26 29
sink fourth call, second loop; parent == 5, size == 10
16 has only one child, 29, so 29 is chosen as the largest child.
16 is less than 29, so swap.
______32(1)______ / \ ___50(2)___ 42(3) / \ / \ 47(4) 29(5) 19(6) 20(7) / \ / 11(8) 26(9) 16(10) index 0 1 2 3 4 5 6 7 8 9 10 value 0 32 50 42 47 29 19 20 11 26 16
sink fourth call, third loop; parent == 10, size == 10
parent * 2 (20) is not less than or equal to size; exit loop.
50, 47, 29, 11, 26, and 16 are now a heap of size 6.
42, 19, and 20 are still a heap of size 3.
sink fifth call, first loop; parent == 1, size == 10
32 has two children, 50 and 42. 50 is picked as the largest child.
32 is less than 50, so swap.
______50(1)______ / \ ___32(2)___ 42(3) / \ / \ 47(4) 29(5) 19(6) 20(7) / \ / 11(8) 26(9) 16(10) index 0 1 2 3 4 5 6 7 8 9 10 value 0 50 32 42 47 29 19 20 11 26 16
sink fifth call, second loop; parent == 2, size == 10
32 has two children, 47 and 29. 47 is picked as the largest child.
32 is less than 47, so swap.
______50(1)______ / \ ___47(2)___ 42(3) / \ / \ 32(4) 29(5) 19(6) 20(7) / \ / 11(8) 26(9) 16(10) index 0 1 2 3 4 5 6 7 8 9 10 value 0 50 47 42 32 29 19 20 11 26 16
sink fifth call, third loop; parent == 4, size == 10
32 has two children, 11 and 26. 26 is picked as the largest child.
32 is greater than 26. No need to swap. Break.
No sixth call to sink, because parent would be 0, breaking the for loop.
We now have a maxheap!
The second for loop of the Sort method uses the array for two purposes. It uses the end of the array for the sorted elements and it uses the start of the array for the max binary heap.
The idea is that the largest item is popped from the max binary heap and put at the end of our sorted array. When we re-heapify the max binary heap, its new largest element is the second-largest element overall; and we can then pop that element and put it behind the maximum element of our sorted elements.
So we create our sorted elements in the order of largest to smallest, filling them in right to left. As we go, the max binary heap, at the other end of our overall array, is shrinking one element at a time, making room for our sorted elements.
Let's go through the second loop of the Sort method in excruciating detail to see how that works.
first loop, size == 10
swap element 1 with element 10
The largest element is now at the end of the array, but we need to re-heapify our heap of size 9
.______16(1)______ / \ ___47(2)___ 42(3) / \ / \ 32(4) 29(5) 19(6) 20(7) / \ / 11(8) 26(9) 50(10) index 0 1 2 3 4 5 6 7 8 9 10 value 0 16 47 42 32 29 19 20 11 26 50
decrement size; size == 9
sink first call, first loop; parent == 1, size == 9
16 has two children, 47 and 42. 47 is picked as the largest child.
16 is less than 47, so swap.
______47(1)______ / \ ___16(2)___ 42(3) / \ / \ 32(4) 29(5) 19(6) 20(7) / \ / 11(8) 26(9) 50(10) index 0 1 2 3 4 5 6 7 8 9 10 value 0 47 16 42 32 29 19 20 11 26 50
sink first call, second loop; parent == 2, size == 9
16 has two children, 32 and 29. 32 is picked as the largest child.
16 is less than 32, so swap.
______47(1)______ / \ ___32(2)___ 42(3) / \ / \ 16(4) 29(5) 19(6) 20(7) / \ / 11(8) 26(9) 50(10) index 0 1 2 3 4 5 6 7 8 9 10 value 0 47 32 42 16 29 19 20 11 26 50
sink first call, third loop; parent == 4, size == 9
16 has two children, 11 and 26. 26 is picked as the largest child.
16 is less than 26, so swap.
______47(1)______ / \ ___32(2)___ 42(3) / \ / \ 26(4) 29(5) 19(6) 20(7) / \ / 11(8) 16(9) 50(10) index 0 1 2 3 4 5 6 7 8 9 10 value 0 47 32 42 26 29 19 20 11 16 50
sink first call, fourth loop; parent == 9, size == 9
parent * 2 (18) is not less than or equal to size; exit loop.
swap element 1 with element 9
______16(1)______ / \ ___32(2)___ 42(3) / \ / \ 26(4) 29(5) 19(6) 20(7) / \ / 11(8) 47(9) 50(10) index 0 1 2 3 4 5 6 7 8 9 10 value 0 16 32 42 26 29 19 20 11 47 50
decrement size; size == 8
sink second call, first loop; parent == 1, size == 8
16 has two children, 32 and 42. 42 is picked as the largest child.
16 is less than 42, so swap.
______42(1)______ / \ ___32(2)___ 16(3) / \ / \ 26(4) 29(5) 19(6) 20(7) / \ / 11(8) 47(9) 50(10) index 0 1 2 3 4 5 6 7 8 9 10 value 0 42 32 16 26 29 19 20 11 47 50
sink second call, second loop; parent == 3, size == 8
16 has two children, 19 and 20. 20 is picked as the largest child.
16 is less than 20, so swap.
______42(1)______ / \ ___32(2)___ 20(3) / \ / \ 26(4) 29(5) 19(6) 16(7) / \ / 11(8) 47(9) 50(10) index 0 1 2 3 4 5 6 7 8 9 10 value 0 42 32 20 26 29 19 16 11 47 50
sink second call, third loop; parent == 7, size == 8
parent * 2 (14) is not less than or equal to size; exit loop.
swap element 1 with element 8
______11(1)______ / \ ___32(2)___ 20(3) / \ / \ 26(4) 29(5) 19(6) 16(7) / \ / 42(8) 47(9) 50(10) index 0 1 2 3 4 5 6 7 8 9 10 value 0 11 32 20 26 29 19 16 42 47 50
decrement size; size == 7
sink third call, first loop; parent == 1, size == 7
11 has two children, 32 and 20. 32 is picked as the largest child.
11 is less than 32, so swap.
______32(1)______ / \ ___11(2)___ 20(3) / \ / \ 26(4) 29(5) 19(6) 16(7) / \ / 42(8) 47(9) 50(10) index 0 1 2 3 4 5 6 7 8 9 10 value 0 32 11 20 26 29 19 16 42 47 50
sink third call, second loop; parent == 2, size == 7
11 has two children, 26 and 29. 29 is picked as the largest child.
11 is less than 29, so swap.
______32(1)______ / \ ___29(2)___ 20(3) / \ / \ 26(4) 11(5) 19(6) 16(7) / \ / 42(8) 47(9) 50(10) index 0 1 2 3 4 5 6 7 8 9 10 value 0 32 29 20 26 11 19 16 42 47 50
sink third call, third loop; parent == 5, size == 7
parent * 2 (10) is not less than or equal to size; exit loop.
swap element 1 with element 7
______16(1)______ / \ ___29(2)___ 20(3) / \ / \ 26(4) 11(5) 19(6) 32(7) / \ / 42(8) 47(9) 50(10) index 0 1 2 3 4 5 6 7 8 9 10 value 0 16 29 20 26 11 19 32 42 47 50
decrement size; size == 6
sink fourth call, first loop; parent == 1, size == 6
16 has two children, 29 and 20. 29 is picked as the largest child.
16 is less than 29, so swap.
______29(1)______ / \ ___16(2)___ 20(3) / \ / \ 26(4) 11(5) 19(6) 32(7) / \ / 42(8) 47(9) 50(10) index 0 1 2 3 4 5 6 7 8 9 10 value 0 29 16 20 26 11 19 32 42 47 50
sink fourth call, second loop; parent == 2, size == 6
16 has two children, 26 and 11. 26 is picked as the largest child.
16 is less than 26, so swap.
______29(1)______ / \ ___26(2)___ 20(3) / \ / \ 16(4) 11(5) 19(6) 32(7) / \ / 42(8) 47(9) 50(10) index 0 1 2 3 4 5 6 7 8 9 10 value 0 29 26 20 16 11 19 32 42 47 50
sink fourth call, third loop; parent == 4, size == 6
parent * 2 (8) is not less than or equal to size; exit loop.
swap element 1 with element 6
______19(1)______ / \ ___26(2)___ 20(3) / \ / \ 16(4) 11(5) 29(6) 32(7) / \ / 42(8) 47(9) 50(10) index 0 1 2 3 4 5 6 7 8 9 10 value 0 19 26 20 16 11 29 32 42 47 50
decrement size; size == 5
sink fifth call, first loop; parent == 1, size == 5
19 has two children, 26 and 20. 26 is picked as the largest child.
19 is less than 26, so swap.
______26(1)______ / \ ___19(2)___ 20(3) / \ / \ 16(4) 11(5) 29(6) 32(7) / \ / 42(8) 47(9) 50(10) index 0 1 2 3 4 5 6 7 8 9 10 value 0 26 19 20 16 11 29 32 42 47 50
sink fifth call, second loop; parent == 2, size == 5
19 has two children, 16 and 11. 16 is picked as the largest child.
19 is greater than 16. No need to swap. Break.
swap element 1 with element 5
______11(1)______ / \ ___19(2)___ 20(3) / \ / \ 16(4) 26(5) 29(6) 32(7) / \ / 42(8) 47(9) 50(10) index 0 1 2 3 4 5 6 7 8 9 10 value 0 11 19 20 16 26 29 32 42 47 50
decrement size; size == 4
sink sixth call, first loop; parent == 1, size == 4
11 has two children, 19 and 20. 19 is picked as the largest child.
11 is less than 20, so swap.
______20(1)______ / \ ___19(2)___ 11(3) / \ / \ 16(4) 26(5) 29(6) 32(7) / \ / 42(8) 47(9) 50(10) index 0 1 2 3 4 5 6 7 8 9 10 value 0 20 19 11 16 26 29 32 42 47 50
sink sixth call, first loop; parent == 3, size == 4
parent * 2 (6) is not less than or equal to size; exit loop.
swap element 1 with element 4
______16(1)______ / \ ___19(2)___ 11(3) / \ / \ 20(4) 26(5) 29(6) 32(7) / \ / 42(8) 47(9) 50(10) index 0 1 2 3 4 5 6 7 8 9 10 value 0 16 19 11 20 26 29 32 42 47 50
decrement size; size == 3
sink seventh call, first loop; parent == 1, size == 3
16 has two children, 19 and 11. 19 is picked as the largest child.
16 is less than 19, so swap.
______19(1)______ / \ ___16(2)___ 11(3) / \ / \ 20(4) 26(5) 29(6) 32(7) / \ / 42(8) 47(9) 50(10) index 0 1 2 3 4 5 6 7 8 9 10 value 0 19 16 11 20 26 29 32 42 47 50
sink seventh call, second loop; parent == 2, size == 3
parent * 2 (4) is not less than or equal to size; exit loop.
swap element 1 with element 3
______11(1)______ / \ ___16(2)___ 19(3) / \ / \ 20(4) 26(5) 29(6) 32(7) / \ / 42(8) 47(9) 50(10) index 0 1 2 3 4 5 6 7 8 9 10 value 0 11 16 19 20 26 29 32 42 47 50
decrement size; size == 2
sink eighth call, first loop; parent == 1, size == 2
REMEMBER that size limits to only looking at left child
11 has one child, 16. 16 is picked as the largest child.
11 is less than 16, so swap.
______16(1)______ / \ ___11(2)___ 19(3) / \ / \ 20(4) 26(5) 29(6) 32(7) / \ / 42(8) 47(9) 50(10) index 0 1 2 3 4 5 6 7 8 9 10 value 0 16 11 19 20 26 29 32 42 47 50
sink eighth call, second loop; parent == 2, size == 2
parent * 2 (4) is not less than or equal to size; exit loop.
swap element 1 with element 2
______11(1)______ / \ ___16(2)___ 19(3) / \ / \ 20(4) 26(5) 29(6) 32(7) / \ / 42(8) 47(9) 50(10) index 0 1 2 3 4 5 6 7 8 9 10 value 0 11 16 19 20 26 29 32 42 47 50
decrement size; size == 1
sink ninth call, first loop; parent == 1, size == 1
parent * 2 (2) is not less than or equal to size; exit loop.
And exit second for loop of Sort, because size is no longer greater than 1.
We now have no more heap left, and a sorted array!