🚧 We are still building the lighthouse. Hang tight!

projectlighthouse logoprojectlighthouse

Sign in with your Google or GitHub account to get started

Heapify

Building a heap from an array in linear time. Understand why heapify is O(n), not O(n log n).

Building a heap from an array can be done in two ways: by inserting elements one by one (O(n log n)) or by using heapify (O(n)). The heapify algorithm is surprisingly efficient and forms the basis for heap sort.

Two Approaches

Approach 1: Repeated Insertion

Insert elements one by one:

fn build_heap_insertions(arr: Vec<i32>) -> Vec<i32> {
    let mut heap = Vec::new();
    for val in arr {
        heap.push(val);
        sift_up(&mut heap, heap.len() - 1);
    }
    heap
}

Time Complexity: O(n log n)

  • Each insertion is O(log n)
  • n insertions = O(n log n)

Approach 2: Heapify (Floyd's Method)

Start from the last non-leaf node and sift down each node:

fn heapify(arr: &mut [i32]) {
    let n = arr.len();
    // Start from last non-leaf node
    for i in (0..n / 2).rev() {
        sift_down(arr, i, n);
    }
}

Time Complexity: O(n) — surprisingly linear!

Why Start from n//2 - 1?

Nodes from index n//2 to n-1 are leaves (they have no children). Leaves already satisfy the heap property trivially, so we only need to process internal nodes.

Heapify Visualization

Array: [4, 10, 3, 5, 1]

Initial tree:
        4
       / \
      10  3
     / \
    5   1

n = 5, start at index n//2 - 1 = 1

Step 1: Sift down index 1 (value 10)
        10 > max(5, 1) = 5, so 10 stays
        (No swap needed)

Step 2: Sift down index 0 (value 4)
        max child = 10
        4 < 10, swap!
        10
       / \
      4   3
     / \
    5   1

        Now check 4 at index 1
        max child = 5
        4 < 5, swap!
        10
       / \
      5   3
     / \
    4   1

Result: [10, 5, 3, 4, 1] — a valid max-heap!

Why O(n)?

The intuition that heapify is O(n log n) is wrong. Here's why it's actually O(n):

Most nodes are near the bottom of the tree:

  • Half the nodes are leaves (0 sifts)
  • Quarter of nodes are one level up (at most 1 sift)
  • Eighth of nodes are two levels up (at most 2 sifts)
  • ...
For a heap of height h:

Level 0 (root):      1 node,  may sift h levels
Level 1:             2 nodes, may sift h-1 levels
Level 2:             4 nodes, may sift h-2 levels
...
Level h-1:           2^(h-1) nodes, may sift 1 level
Level h (leaves):    2^h nodes, sift 0 levels

Total work = Σ (nodes at level i) × (h - i)
           = Σ 2^i × (h - i)
           = O(n)

The math works out to linear time!

Sift Down, Not Sift Up

Heapify uses sift down from internal nodes, not sift up. Using sift up from index 0 would give O(n log n). The key insight is that sifting down from leaves is cheap (short paths), and most nodes are leaves.

Complete Heapify Implementation

fn heapify_max(arr: &mut [i32]) {
    // Convert array to max-heap in-place
    let n = arr.len();

    // Start from last non-leaf and go up
    for i in (0..n / 2).rev() {
        sift_down_max(arr, i, n);
    }
}

fn sift_down_max(arr: &mut [i32], mut i: usize, n: usize) {
    // Sift down for max-heap
    loop {
        let mut largest = i;
        let left = 2 * i + 1;
        let right = 2 * i + 2;

        if left < n && arr[left] > arr[largest] {
            largest = left;
        }
        if right < n && arr[right] > arr[largest] {
            largest = right;
        }

        if largest == i {
            break;
        }

        arr.swap(i, largest);
        i = largest;
    }
}

// Usage
fn main() {
    let mut arr = vec![4, 10, 3, 5, 1, 7, 8];
    heapify_max(&mut arr);
    println!("{:?}", arr);  // [10, 5, 8, 4, 1, 7, 3]
}

For min-heap:

fn heapify_min(arr: &mut [i32]) {
    let n = arr.len();
    for i in (0..n / 2).rev() {
        sift_down_min(arr, i, n);
    }
}

fn sift_down_min(arr: &mut [i32], mut i: usize, n: usize) {
    loop {
        let mut smallest = i;
        let left = 2 * i + 1;
        let right = 2 * i + 2;

        if left < n && arr[left] < arr[smallest] {
            smallest = left;
        }
        if right < n && arr[right] < arr[smallest] {
            smallest = right;
        }

        if smallest == i {
            break;
        }

        arr.swap(i, smallest);
        i = smallest;
    }
}

Rust's from_iter for BinaryHeap

use std::collections::BinaryHeap;

let arr = vec![4, 10, 3, 5, 1, 7, 8];
let heap: BinaryHeap<_> = arr.into_iter().collect();  // Creates a max-heap
let sorted: Vec<_> = heap.into_sorted_vec();
println!("{:?}", sorted);  // [1, 3, 4, 5, 7, 8, 10]

Heap Sort

Heapify enables an elegant sorting algorithm:

fn heap_sort(arr: &mut [i32]) {
    let n = arr.len();

    // Build max-heap
    for i in (0..n / 2).rev() {
        sift_down_max(arr, i, n);
    }

    // Extract elements one by one
    for i in (1..n).rev() {
        // Move current root to end
        arr.swap(0, i);

        // Sift down on reduced heap
        sift_down_max(arr, 0, i);
    }
}

fn main() {
    let mut arr = vec![4, 10, 3, 5, 1, 7, 8];
    heap_sort(&mut arr);
    println!("{:?}", arr);  // [1, 3, 4, 5, 7, 8, 10]
}

Heap Sort Complexity:

  • Time: O(n log n) - heapify O(n) + n extractions O(n log n)
  • Space: O(1) - in-place
  • Not stable
Heap sort visualization:

Initial:     [4, 10, 3, 5, 1, 7, 8]

After heapify (max-heap):
             [10, 5, 8, 4, 1, 7, 3]

Extraction steps:
1. Swap 10 with 3, sift down: [8, 5, 7, 4, 1, 3, |10]
2. Swap 8 with 3, sift down:  [7, 5, 3, 4, 1, |8, 10]
3. Swap 7 with 1, sift down:  [5, 4, 3, 1, |7, 8, 10]
4. Swap 5 with 1, sift down:  [4, 1, 3, |5, 7, 8, 10]
5. Swap 4 with 3, sift down:  [3, 1, |4, 5, 7, 8, 10]
6. Swap 3 with 1, sift down:  [1, |3, 4, 5, 7, 8, 10]

Sorted!

nlargest and nsmallest

Rust can use BinaryHeap for efficient top-k operations:

use std::collections::BinaryHeap;
use std::cmp::Reverse;

fn nlargest(arr: &[i32], k: usize) -> Vec<i32> {
    let mut heap = BinaryHeap::from(arr.to_vec());
    (0..k).filter_map(|_| heap.pop()).collect()
}

fn nsmallest(arr: &[i32], k: usize) -> Vec<i32> {
    let mut heap: BinaryHeap<_> = arr.iter().map(|&x| Reverse(x)).collect();
    (0..k).filter_map(|_| heap.pop()).map(|Reverse(x)| x).collect()
}

fn main() {
    let arr = vec![4, 10, 3, 5, 1, 7, 8, 2, 9, 6];

    println!("{:?}", nlargest(&arr, 3));   // [10, 9, 8]
    println!("{:?}", nsmallest(&arr, 3));  // [1, 2, 3]
}

Time Complexity:

  • For k << n: O(n + k log n) using heapify + k extractions
  • More efficient than sorting when k is small

Comparison of Building Methods

| Method | Time | When to Use | |--------|------|-------------| | Repeated insertion | O(n log n) | When elements arrive one by one | | Heapify | O(n) | When you have all elements upfront |

Applications of Heapify

  1. Heap Sort: Build heap, then extract all elements
  2. Top-K Elements: Heapify then extract k times
  3. Priority Queue Initialization: Build from existing data
  4. Median Finding: Using two heaps
  5. K-way Merge: Initial heap from k lists

Key Takeaways

  1. Heapify converts an array to a heap in O(n) time
  2. Start from the last non-leaf node and sift down
  3. Most nodes are near leaves, so most sifts are short
  4. Heapify is the foundation of heap sort
  5. In-place operation — no extra space needed
  6. Rust's BinaryHeap::from() or .collect() does this for max-heaps

The O(n) heapify is a beautiful result in algorithm analysis. Understanding why it's linear (most work happens on short paths near leaves) deepens your understanding of algorithm complexity.

Notes & Highlights

© 2025 projectlighthouse. All rights reserved.