heapify VS build heap
Categories:
Heapify vs. Build Heap: Understanding the Core Operations of Heaps
Explore the fundamental differences between the heapify
operation and the build heap
algorithm, crucial for efficient heap data structure manipulation.
Heaps are a fundamental data structure in computer science, widely used in priority queues, heap sort, and graph algorithms. At their core, heaps maintain a specific ordering property (min-heap or max-heap). Two key operations, heapify
and build heap
, are often confused but serve distinct purposes in managing this property. This article will clarify their roles, mechanisms, and applications.
What is Heapify?
Heapify
is a procedure that maintains the heap property. It assumes that the binary trees rooted at the children of a given node are already heaps, but the node itself might violate the heap property. The heapify
operation's goal is to correct this violation by moving the element at the current node down the tree until the heap property is restored. This involves comparing the parent with its children and swapping if necessary, then recursively calling heapify
on the affected child.
Flowchart of the Heapify operation
def heapify(arr, n, i):
largest = i # Initialize largest as root
l = 2 * i + 1 # left child
r = 2 * i + 2 # right child
# See if left child of root exists and is greater than root
if l < n and arr[i] < arr[l]:
largest = l
# See if right child of root exists and is greater than root
if r < n and arr[largest] < arr[r]:
largest = r
# Change root, if needed
if largest != i:
arr[i], arr[largest] = arr[largest], arr[i] # swap
# Heapify the root.
heapify(arr, n, largest)
Python implementation of the heapify
function (for a max-heap)
heapify
operation has a time complexity of O(log n), where n is the number of elements in the heap. This is because in the worst case, an element might have to travel from the root to a leaf node, which is the height of the tree.What is Build Heap?
The build heap
algorithm, on the other hand, takes an arbitrary array (which may not satisfy the heap property) and transforms it into a valid heap. It achieves this by applying the heapify
operation to all non-leaf nodes in the array, starting from the last non-leaf node and working upwards to the root. This bottom-up approach ensures that when heapify
is called on a node, its children's subtrees are already valid heaps.
Visualizing the Build Heap process from an unsorted array
def build_heap(arr):
n = len(arr)
# Start from the last non-leaf node and go up to the root
# The index of the last non-leaf node is n//2 - 1
for i in range(n // 2 - 1, -1, -1):
heapify(arr, n, i)
return arr
# Example usage:
# arr = [4, 10, 3, 5, 1]
# build_heap(arr)
# print(arr) # Output: [10, 5, 3, 4, 1] (for max-heap)
Python implementation of the build_heap
function using heapify
heapify
multiple times, the build heap
algorithm has an optimal time complexity of O(n). This is a non-trivial result that comes from the fact that most heapify
calls operate on small subtrees near the leaves.Key Differences and Use Cases
The primary distinction lies in their scope and assumptions:
Heapify
: A local operation. It assumes that only the root of a subtree might violate the heap property, and its children's subtrees are already valid heaps. It fixes a single violation.Build Heap
: A global operation. It takes an entire arbitrary array and converts it into a heap by systematically applyingheapify
from the bottom up. It creates a heap from scratch.
Heapify
is used when an element is inserted or deleted from a heap, causing a local violation that needs to be fixed. Build Heap
is used to initialize a heap from an unsorted collection of elements, such as in the first step of Heap Sort.
Heapify
# Used after insertion or deletion to restore heap property
def insert_into_heap(heap_arr, item):
heap_arr.append(item)
i = len(heap_arr) - 1
# 'Bubble up' the new element
while i > 0 and heap_arr[(i - 1) // 2] < heap_arr[i]:
heap_arr[i], heap_arr[(i - 1) // 2] = heap_arr[(i - 1) // 2], heap_arr[i]
i = (i - 1) // 2
# After extracting max/min, heapify is called on the new root
def extract_max(heap_arr):
if not heap_arr: return None
if len(heap_arr) == 1: return heap_arr.pop()
max_val = heap_arr[0]
heap_arr[0] = heap_arr.pop() # Move last element to root
heapify(heap_arr, len(heap_arr), 0) # Restore heap property
return max_val
Build Heap
# Used to convert an arbitrary array into a heap
def build_heap_example(arr):
n = len(arr)
# Apply heapify from the last non-leaf node upwards
for i in range(n // 2 - 1, -1, -1):
heapify(arr, n, i)
return arr
# Example of building a heap from an unsorted array
unsorted_array = [12, 11, 13, 5, 6, 7]
max_heap = build_heap_example(unsorted_array)
print(f"Heapified array: {max_heap}") # Expected: [13, 11, 12, 5, 6, 7] or similar valid max-heap