Skip to main content

Heap Fundamentals

TL;DR

Binary heaps are complete binary trees stored in arrays where parent ≤ children (min-heap) or parent ≥ children (max-heap). Heap operations maintain partial ordering: insertion and deletion run O(log n) via sift operations. Heapify builds heap from array in O(n)—more efficient than n insertions. Arrays represent heaps compactly: parent at i has children at 2i+1 and 2i+2. Understanding sift-up, sift-down, and heapify operations enables efficient priority queues and sorting algorithms.

Core Concepts

Min-Heap and Max-Heap Properties

Min-heap property requires parent ≤ children for every node. Root contains minimum element. Subtrees are also min-heaps.

Max-heap property requires parent ≥ children. Root contains maximum element. Useful for priority queues where higher priority items come first.

Partial ordering vs full sorting: heaps don't sort entire array, only maintaining root as minimum/maximum. Enables efficient top-K problems without full sorting.

Complete binary tree ensures height O(log n). All levels filled except last, which fills left-to-right. Enables compact array representation and height guarantee.

Heap Operations and Complexity

Insert appends element to end, then sift-up. New element bubbles up while violating heap property. O(log n) in worst case when rising to root.

Extract removes root, replaces with last element, then sift-down. O(log n) due to sift operation.

Peek returns root without modification. O(1).

Sift-up compares element to parent. If violating property, swap and continue up. Stops when property satisfied or reaching root.

Sift-down compares element to children. If violating property, swap with smaller/larger child (depending on min/max), continue down. More efficient implementation compares to both children before swapping.

Heapify and Build Heap

Heapify from array converts arbitrary array to heap. O(n) time, more efficient than n insertions (which is O(n log n)).

Algorithm iterates from last non-leaf downward, sifting each. Last non-leaf is at index (n-1)/2 (parent of last leaf).

Why O(n)? Upper levels contain fewer elements; sift cost amortizes. Last level has n/2 elements sifting 1 level; next level has n/4 elements sifting 2 levels, etc. Total: n/2 * 1 + n/4 * 2 + n/8 * 3 + ... = O(n).

Heap sort application builds heap, extracts n times. O(n log n) but not stable.

Array Representation

Index formula for node at index i:

  • Parent: (i-1)/2
  • Left child: 2i+1
  • Right child: 2i+2

0-indexed arrays assume root at index 0. Simplifies formulas.

1-indexed arrays use parent at i, children at 2i and 2i+1. Sometimes cleaner but wastes space.

Memory efficiency stores complete tree compactly. No wasted space for full tree representation versus linked nodes.

Key Algorithms and Techniques

Sift-Up for Insertion

Element bubbles up while parent violates min-heap property.

Sift-Down for Deletion

Root replaced with last element; sifts down to restore property.

Heapify from Array

Iterate non-leaf nodes backward, sifting each in O(n) total.

Heap Sort

Build heap O(n), extract n times O(n log n). Total O(n log n).

Practical Example

class MinHeap:
def __init__(self):
self.heap = []

def parent(self, i):
return (i - 1) // 2

def left_child(self, i):
return 2 * i + 1

def right_child(self, i):
return 2 * i + 2

def sift_up(self, i):
"""Restore heap property by moving up."""
while i > 0 and self.heap[i] < self.heap[self.parent(i)]:
# Swap with parent
self.heap[i], self.heap[self.parent(i)] = self.heap[self.parent(i)], self.heap[i]
i = self.parent(i)

def sift_down(self, i):
"""Restore heap property by moving down."""
smallest = i
left = self.left_child(i)
right = self.right_child(i)

if left < len(self.heap) and self.heap[left] < self.heap[smallest]:
smallest = left
if right < len(self.heap) and self.heap[right] < self.heap[smallest]:
smallest = right

if smallest != i:
self.heap[i], self.heap[smallest] = self.heap[smallest], self.heap[i]
self.sift_down(smallest)

def insert(self, val):
"""Insert element maintaining heap property."""
self.heap.append(val)
self.sift_up(len(self.heap) - 1)

def extract_min(self):
"""Remove and return minimum element."""
if len(self.heap) == 0:
return None
root = self.heap[0]
self.heap[0] = self.heap[-1]
self.heap.pop()
if self.heap:
self.sift_down(0)
return root

def peek(self):
"""Return minimum without removing."""
return self.heap[0] if self.heap else None

def build_heap(self, arr):
"""Build heap from array in O(n)."""
self.heap = arr.copy()
for i in range(len(self.heap) // 2 - 1, -1, -1):
self.sift_down(i)

def heap_sort(self, arr):
"""Sort array using heap."""
self.build_heap(arr)
result = []
while self.heap:
result.append(self.extract_min())
return result

# Test
heap = MinHeap()
for val in [3, 1, 4, 1, 5, 9, 2, 6]:
heap.insert(val)

print(f"Min: {heap.peek()}")
print(f"Heap sort [3,1,4,1,5,9,2,6]: {heap.heap_sort([3,1,4,1,5,9,2,6])}")

heap2 = MinHeap()
heap2.build_heap([9, 5, 6, 2, 3])
print(f"Built heap from [9,5,6,2,3]: {heap2.heap}")

Common Pitfalls

warning

Index formula errors: Using 2i/2i+1 instead of 2i+1/2i+2 causes incorrect tree structure. Verify formulas with example.

Not sifting after insert/delete: Adding element at end without sifting-up breaks heap property. Must restore after any addition.

Forgetting to handle last element replacement: Deletion must move last element to deleted node's position, then sift. Deleting node without replacement leaves gap.

Heapify starting from wrong index: Starting from index 0 instead of last non-leaf (n/2-1) is inefficient. Correct formula is necessary for O(n) complexity.

Heap sort stability confusion: Heapsort is not stable. Use merge sort if stability required.

Self-Check

  1. Explain sift-up and sift-down operations. When does each apply?
  2. Why is heapify from array O(n) instead of O(n log n)? Work through cost analysis.
  3. How do index formulas map complete binary tree to array? Why is 0-indexing cleaner?

One Takeaway

info

Binary heaps combine simplicity with efficiency: array representation, O(log n) operations, and O(n) build time. Understanding sift operations and index formulas reveals elegance in priority queue implementation. Heaps are foundational for sorting, scheduling, and selection problems.

References

  • Cormen, T. H., Leiserson, C. E., Rivest, R. L., & Stein, C. (2009). Introduction to Algorithms (3rd ed.). MIT Press.
  • McDowell, G. L. (2015). Cracking the Coding Interview (6th ed.). CareerCup.