Proof of correctness: Algorithm for diameter of a tree in graph theory

Learn proof of correctness: algorithm for diameter of a tree in graph theory with practical examples, diagrams, and best practices. Covers algorithm, tree, graph-theory development techniques with ...

Proof of Correctness: Algorithm for Tree Diameter in Graph Theory

Abstract representation of a tree graph with its diameter highlighted, symbolizing the longest path.

Explore the algorithm for finding the diameter of a tree and delve into its formal proof of correctness, ensuring its reliability and efficiency.

The diameter of a tree is defined as the longest path between any two nodes in the tree. Finding this path is a fundamental problem in graph theory with applications in network design, phylogenetic tree analysis, and more. While several algorithms exist, a common and efficient approach involves two Breadth-First Search (BFS) traversals. This article will detail this algorithm and provide a rigorous proof of its correctness.

The Two-BFS Algorithm for Tree Diameter

The algorithm to find the diameter of a tree is surprisingly simple yet effective. It leverages the properties of trees and BFS to identify the two endpoints of the longest path. The core idea is that one endpoint of any longest path must be one of the farthest nodes from an arbitrary starting point. Once one such endpoint is found, the farthest node from it will be the other endpoint of a diameter.

flowchart TD
    A[Start: Pick an arbitrary node 'u'] --> B{Perform BFS from 'u'}
    B --> C[Find node 'x' farthest from 'u']
    C --> D{Perform BFS from 'x'}
    D --> E[Find node 'y' farthest from 'x']
    E --> F[The path between 'x' and 'y' is a diameter]
    F --> G[End]

Flowchart of the Two-BFS Algorithm for Tree Diameter

Let's break down the steps of the algorithm:

1. First BFS

Choose an arbitrary node u in the tree. Perform a Breadth-First Search (BFS) starting from u. During this BFS, keep track of the distance of each node from u. Identify the node x that is farthest from u.

2. Second BFS

Perform another BFS, this time starting from node x (the farthest node found in the first BFS). Again, track the distances of all nodes from x. Identify the node y that is farthest from x.

3. Result

The path between x and y is a diameter of the tree, and the distance between x and y is the length of the diameter.

Proof of Correctness

To prove the correctness of the two-BFS algorithm, we need to show that the path found between x and y is indeed a longest path in the tree. The proof relies on two key lemmas.

Lemma 1: Let P be any longest path in a tree, and let u be an arbitrary node. If we perform a BFS from u and find the node x farthest from u, then x must be an endpoint of some longest path.

Proof of Lemma 1: Assume for contradiction that x is not an endpoint of any longest path. Let P_max = (a, ..., b) be a longest path in the tree. Since x is not an endpoint of P_max, x must be outside P_max or an internal node of P_max. If x is an internal node, then dist(u, x) would be less than dist(u, a) or dist(u, b) (unless u is also on P_max and x is one of its endpoints, which contradicts x not being an endpoint of any longest path). If x is outside P_max, let c be the node on P_max closest to x. Then the path (u, ..., x) must intersect P_max at some point. Consider the path (x, ..., c, ..., a) or (x, ..., c, ..., b). One of these paths must be longer than (u, ..., x) if x is not an endpoint of a longest path, which contradicts x being the farthest node from u. Therefore, x must be an endpoint of some longest path.

Lemma 2: If x is an endpoint of a longest path P_max = (x, ..., y'), then the node y farthest from x (found in the second BFS) must be y' (the other endpoint of P_max).

Proof of Lemma 2: Assume for contradiction that y is not y'. This means there exists a path (x, ..., y) such that dist(x, y) > dist(x, y'). But (x, ..., y') is a longest path by definition. If dist(x, y) > dist(x, y'), then (x, ..., y) would be a path longer than P_max, which contradicts P_max being a longest path. Therefore, y must be y', and the path (x, ..., y) is indeed a longest path.

💡

The two-BFS algorithm works because in a tree, the path between any two nodes is unique. This uniqueness simplifies distance calculations and path identification.

Algorithm Implementation (Pseudocode)

Here's a pseudocode representation of the algorithm. The BFS function returns the farthest node and its distance from the source.

function BFS(graph, start_node):
    distances = {} # Stores distance from start_node
    queue = []
    
    for node in graph.nodes():
        distances[node] = infinity
    
    distances[start_node] = 0
    queue.append(start_node)
    
    farthest_node = start_node
    max_distance = 0
    
    while queue is not empty:
        current_node = queue.pop(0)
        
        if distances[current_node] > max_distance:
            max_distance = distances[current_node]
            farthest_node = current_node
            
        for neighbor in graph.neighbors(current_node):
            if distances[neighbor] == infinity:
                distances[neighbor] = distances[current_node] + 1
                queue.append(neighbor)
                
    return farthest_node, max_distance

function find_tree_diameter(graph):
    # Step 1: First BFS from an arbitrary node (e.g., node 0)
    node_u = graph.nodes()[0] # Assuming nodes are indexed or iterable
    farthest_from_u, _ = BFS(graph, node_u)
    
    # Step 2: Second BFS from the farthest node found
    farthest_from_farthest, diameter_length = BFS(graph, farthest_from_u)
    
    return diameter_length, farthest_from_u, farthest_from_farthest

Pseudocode for the Two-BFS Tree Diameter Algorithm

ℹ️

The time complexity of this algorithm is O(V + E), where V is the number of vertices and E is the number of edges, because it performs two BFS traversals, each taking O(V + E) time. For a tree, E = V - 1, so the complexity is O(V).

Proof of correctness: Algorithm for diameter of a tree in graph theory

Tags:

Categories:

Proof of Correctness: Algorithm for Tree Diameter in Graph Theory

The Two-BFS Algorithm for Tree Diameter

1. First BFS

2. Second BFS

3. Result

Proof of Correctness

Algorithm Implementation (Pseudocode)