How to create a MySQL hierarchical recursive query?

Learn how to create a mysql hierarchical recursive query? with practical examples, diagrams, and best practices. Covers mysql, sql, hierarchical-data development techniques with visual explanations.

Mastering Hierarchical Data with MySQL Recursive CTEs

Hero image for How to create a MySQL hierarchical recursive query?

Explore how to effectively query hierarchical data structures in MySQL using recursive Common Table Expressions (CTEs), a powerful feature for managing parent-child relationships.

Hierarchical data, such as organizational charts, file systems, or threaded comments, is a common challenge in database management. Traditionally, querying this type of data in MySQL involved complex self-joins or application-level logic. However, with the introduction of Recursive Common Table Expressions (CTEs) in MySQL 8.0, handling hierarchical queries has become significantly more elegant and efficient. This article will guide you through the process of creating and utilizing recursive CTEs to navigate and retrieve data from hierarchical structures.

Understanding Hierarchical Data and Recursive CTEs

Hierarchical data is characterized by parent-child relationships, where each child can have only one parent, but a parent can have multiple children. A classic example is an employee table where each employee reports to a manager, who is also an employee. Recursive CTEs allow you to define a query that refers to itself, enabling the traversal of these relationships level by level until a base condition is met. This makes them ideal for tasks like finding all descendants of a node or tracing the ancestry of an item.

erDiagram
    EMPLOYEE ||--o{ EMPLOYEE : "reports to"
    EMPLOYEE {
        INT employee_id PK
        VARCHAR name
        INT manager_id FK
        VARCHAR position
    }

Entity-Relationship Diagram for a self-referencing Employee table

Setting Up Your Hierarchical Data

Before we dive into recursive queries, let's set up a sample table representing an organizational hierarchy. We'll create an employees table where each employee has an employee_id, a name, and a manager_id that points to their direct manager. The manager_id will be NULL for the top-level employee (e.g., the CEO).

CREATE TABLE employees (
    employee_id INT PRIMARY KEY,
    name VARCHAR(100) NOT NULL,
    manager_id INT,
    position VARCHAR(100),
    FOREIGN KEY (manager_id) REFERENCES employees(employee_id)
);

INSERT INTO employees (employee_id, name, manager_id, position) VALUES
(1, 'Alice', NULL, 'CEO'),
(2, 'Bob', 1, 'CTO'),
(3, 'Charlie', 1, 'CFO'),
(4, 'David', 2, 'Lead Developer'),
(5, 'Eve', 2, 'Lead QA'),
(6, 'Frank', 3, 'Accountant'),
(7, 'Grace', 4, 'Developer'),
(8, 'Heidi', 4, 'Developer');

SQL schema and sample data for the employees table

Constructing a Recursive CTE for Descendants

A recursive CTE consists of two main parts: an anchor member and a recursive member, connected by a UNION ALL operator. The anchor member establishes the base result set, and the recursive member repeatedly joins with the result of the previous iteration until no new rows are produced. Let's find all employees who report directly or indirectly to 'Alice' (employee_id = 1).

WITH RECURSIVE EmployeeHierarchy AS (
    -- Anchor member: Select the top-level employee (Alice)
    SELECT
        employee_id,
        name,
        manager_id,
        position,
        0 AS level
    FROM
        employees
    WHERE
        employee_id = 1

    UNION ALL

    -- Recursive member: Join with the CTE to find direct reports
    SELECT
        e.employee_id,
        e.name,
        e.manager_id,
        e.position,
        eh.level + 1 AS level
    FROM
        employees e
    INNER JOIN
        EmployeeHierarchy eh ON e.manager_id = eh.employee_id
)
SELECT
    employee_id,
    name,
    position,
    level
FROM
    EmployeeHierarchy
ORDER BY
    level, employee_id;

Recursive CTE to find all descendants of a specific employee

Tracing Ancestry with Recursive CTEs

Recursive CTEs can also be used to trace the path upwards, from a child node to its ultimate ancestor. This is useful for understanding the reporting chain for a specific employee. Let's find the entire management chain for 'Grace' (employee_id = 7).

WITH RECURSIVE EmployeeAncestry AS (
    -- Anchor member: Select the starting employee (Grace)
    SELECT
        employee_id,
        name,
        manager_id,
        position,
        0 AS level
    FROM
        employees
    WHERE
        employee_id = 7

    UNION ALL

    -- Recursive member: Join with the CTE to find the manager
    SELECT
        e.employee_id,
        e.name,
        e.manager_id,
        e.position,
        ea.level + 1 AS level
    FROM
        employees e
    INNER JOIN
        EmployeeAncestry ea ON e.employee_id = ea.manager_id
)
SELECT
    employee_id,
    name,
    position,
    level
FROM
    EmployeeAncestry
ORDER BY
    level DESC;

Recursive CTE to trace the ancestry of a specific employee