What Is the Time Complexity of Kruskal's Algorithm?

Understanding the Time Complexity of Kruskal's Algorithm

Explore the time complexity of Kruskal's algorithm for finding Minimum Spanning Trees, breaking down its components and analyzing its efficiency.

Kruskal's algorithm is a greedy algorithm used to find a Minimum Spanning Tree (MST) for a connected, undirected graph. An MST is a subset of the edges of a connected, edge-weighted undirected graph that connects all the vertices together, without any cycles and with the minimum possible total edge weight. Understanding its time complexity is crucial for evaluating its performance on large datasets.

Core Steps of Kruskal's Algorithm

Kruskal's algorithm operates by iteratively adding the cheapest available edge that does not form a cycle with the already added edges. This process continues until all vertices are connected. The algorithm relies heavily on a Disjoint Set Union (DSU) data structure to efficiently detect cycles.
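
In code form, this greedy core is only a few lines. Below is a compact sketch, assuming a DSU object that exposes find and union (a full, runnable implementation appears later in this article); the flowchart that follows traces the same loop:

def kruskal_core(dsu, edges):
    # Scan edges in ascending weight order and keep an edge only if its
    # endpoints currently lie in different DSU sets, i.e., accepting it
    # would not close a cycle.
    mst = []
    for weight, u, v in sorted(edges):
        if dsu.find(u) != dsu.find(v):  # cycle check via two find operations
            dsu.union(u, v)             # merge the two components
            mst.append((u, v, weight))
    return mst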

flowchart TD
    A[Start Kruskal's Algorithm]
    B[Sort all edges by weight in non-decreasing order]
    C["Initialize Disjoint Set Union (DSU) structure for all vertices"]
    D[Iterate through sorted edges]
    E["Pick the next cheapest edge (u, v)"]
    F{Are u and v already in the same set?}
    G["Add edge (u, v) to MST"]
    H[Union the sets containing u and v]
    I{All vertices connected or all edges processed?}
    J[End Algorithm]

    A --> B
    B --> C
    C --> D
    D --> E
    E --> F
    F -- Yes --> D
    F -- No --> G
    G --> H
    H --> I
    I -- No --> D
    I -- Yes --> J

Flowchart of Kruskal's Algorithm Steps

Analyzing Time Complexity Components

The overall time complexity of Kruskal's algorithm is determined by the sum of the complexities of its individual steps. Let V be the number of vertices and E be the number of edges in the graph.

  1. Sorting Edges: The first, and usually the dominant, step is sorting all E edges by weight in non-decreasing order. With an efficient comparison sort such as Merge Sort or Heap Sort, this takes O(E log E) time (see the one-line sketch after this list).

  2. Disjoint Set Union (DSU) Operations:

    • Initialization: Initializing the DSU structure for V vertices takes O(V) time, as each vertex is initially in its own set.
    • find and union operations: For each of the E edges, we perform two find operations (to check whether u and v are already in the same set) and at most one union operation (when they are not). With path compression and union by rank/size optimizations, the amortized cost per operation is O(α(V)), where α is the inverse Ackermann function; α grows so slowly that its value stays below 5 for any physically realistic input size. The E DSU operations therefore take O(E α(V)) time in total (see the demonstration after this list).
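
In Python, the sort in step 1 is a single call. A small sketch (note that sorting the (weight, u, v) tuples directly would also compare u and v on weight ties, which does not affect correctness):

edges = [(4, 1, 2), (1, 0, 1), (3, 0, 2), (2, 1, 3)]
edges.sort(key=lambda e: e[0])  # Timsort: O(E log E) comparisons in the worst case
print(edges)  # [(1, 0, 1), (2, 1, 3), (3, 0, 2), (4, 1, 2)]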
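
The inverse Ackermann function from step 2 is awkward to compute directly, but the iterated logarithm log* already illustrates how flat such functions are, and α grows more slowly still. A quick demonstration:

import math

def log_star(n):
    """Iterated logarithm: how many times log2 must be applied before n <= 1."""
    count = 0
    while n > 1:
        n = math.log2(n)
        count += 1
    return count

for n in (10, 10**3, 10**6, 10**9, 10**18):
    print(f"n = {n}: log2(n) ~ {math.log2(n):.1f}, log*(n) = {log_star(n)}")

Even at n = 10**18, log*(n) is only 5, and α(n) is smaller still, which is why the DSU term behaves like a small constant in practice.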

Overall Time Complexity

Combining these components, the total time complexity of Kruskal's algorithm is:

T(V, E) = O(E log E) + O(V) + O(E α(V))

Since an MST requires a connected graph, E is at least V - 1, so the O(E log E) sorting term dominates both the O(V) initialization and the O(E α(V)) DSU work. The overall time complexity therefore simplifies to O(E log E).

Because a simple graph has at most V(V-1)/2 = O(V^2) edges, log E is at most log(V^2) = 2 log V. Two things follow: the bounds O(E log E) and O(E log V) are interchangeable (both appear in the literature), and in the dense worst case, where E = O(V^2), the total running time becomes O(V^2 log V).
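
The equivalence between O(E log E) and O(E log V) follows in one line:

\[
E \le \frac{V(V-1)}{2} < V^2 \;\Longrightarrow\; \log E < 2\log V \;\Longrightarrow\; O(E \log E) = O(E \log V).
\]

The Python implementation below puts all of the pieces together; the comments mark where each complexity term arises.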

class DSU:
    """Disjoint Set Union (Union-Find) with path compression and union by rank."""

    def __init__(self, n):
        self.parent = list(range(n))  # each vertex starts as its own root
        self.rank = [0] * n           # rank is an upper bound on tree height

    def find(self, i):
        # Path compression: point every node on the search path at the root.
        if self.parent[i] == i:
            return i
        self.parent[i] = self.find(self.parent[i])
        return self.parent[i]

    def union(self, i, j):
        # Union by rank: attach the shallower tree under the deeper root.
        root_i = self.find(i)
        root_j = self.find(j)

        if root_i != root_j:
            if self.rank[root_i] < self.rank[root_j]:
                self.parent[root_i] = root_j
            elif self.rank[root_i] > self.rank[root_j]:
                self.parent[root_j] = root_i
            else:
                self.parent[root_j] = root_i
                self.rank[root_i] += 1
            return True   # the two sets were merged
        return False      # i and j already connected: this edge would form a cycle

def kruskal(num_vertices, edges):
    # edges: list of (weight, u, v) tuples
    edges.sort()  # O(E log E); ties on weight fall back to comparing u and v, which is harmless

    dsu = DSU(num_vertices) # O(V)
    mst_cost = 0
    mst_edges = []
    edges_count = 0

    for weight, u, v in edges:
        if dsu.union(u, v): # O(alpha(V)) amortized per union/find
            mst_cost += weight
            mst_edges.append((u, v, weight))
            edges_count += 1
            if edges_count == num_vertices - 1:
                break  # MST complete; the remaining edges can be skipped

    return mst_cost, mst_edges

# Example usage:
if __name__ == "__main__":
    num_vertices = 4
    edges = [
        (1, 0, 1),  # (weight, u, v)
        (3, 0, 2),
        (4, 1, 2),
        (2, 1, 3),
        (5, 2, 3),
    ]
    cost, mst = kruskal(num_vertices, edges)
    print(f"MST Cost: {cost}")    # MST Cost: 6
    print(f"MST Edges: {mst}")    # MST Edges: [(0, 1, 1), (1, 3, 2), (0, 2, 3)]