What's the difference between RANK() and DENSE_RANK() functions in oracle?

Learn what's the difference between rank() and dense_rank() functions in oracle? with practical examples, diagrams, and best practices. Covers sql, oracle-database, window-functions development tec...

RANK() vs. DENSE_RANK(): Understanding Oracle's Window Functions for Ranking Data

A visual representation of data rows being ranked, with some rows having ties and different ranking outcomes for RANK() and DENSE_RANK().

Explore the key differences between Oracle's RANK() and DENSE_RANK() window functions, learn how they handle ties, and discover practical use cases with clear examples.

When working with analytical queries in Oracle SQL, ranking functions are indispensable for assigning a numerical rank to each row within a partition based on specified ordering criteria. Among the most commonly used are RANK() and DENSE_RANK(). While both serve to rank data, their behavior when encountering tied values is fundamentally different, leading to distinct results that are crucial to understand for accurate data analysis.

Introduction to Window Functions and Ranking

Window functions perform calculations across a set of table rows that are somehow related to the current row. Unlike aggregate functions, window functions do not collapse rows; instead, they return a value for each row in the result set. Ranking functions are a specific type of window function that assign a rank to each row within its partition. The basic syntax for these functions involves an OVER() clause, which defines the window (or partition) of rows on which the function operates and the order within that window.

SELECT
    column1,
    column2,
    RANK() OVER (PARTITION BY column1 ORDER BY column2 DESC) AS rank_value,
    DENSE_RANK() OVER (PARTITION BY column1 ORDER BY column2 DESC) AS dense_rank_value
FROM
    your_table;

Basic syntax for RANK() and DENSE_RANK() with PARTITION BY and ORDER BY clauses.

Understanding RANK()

The RANK() function assigns a unique rank to each distinct row within its partition. If two or more rows have identical values in the ORDER BY clause (i.e., they are 'tied'), they receive the same rank. However, RANK() then skips the next rank numbers, effectively creating gaps in the ranking sequence. The number of skipped ranks corresponds to the number of tied rows minus one. For example, if two rows are tied for rank 2, both get rank 2, and the next distinct rank will be 4 (skipping 3).

flowchart TD
    A[Start Ranking] --> B{Process Row}
    B --"Order By Value"--> C{Is current row tied with previous?}
    C --"Yes"--> D[Assign same rank as previous]
    C --"No"--> E[Assign next available rank, skipping ranks if ties occurred]
    D --> F[Move to next row]
    E --> F
    F --"More Rows?"--> B
    F --"No"--> G[End Ranking]

Flowchart illustrating the logic of the RANK() function, including rank skipping for ties.

Understanding DENSE_RANK()

In contrast, the DENSE_RANK() function also assigns a unique rank to each distinct row within its partition. Like RANK(), tied rows receive the same rank. The crucial difference is that DENSE_RANK() does not skip any rank numbers after a tie. It assigns consecutive ranks to the distinct values. If two rows are tied for rank 2, both get rank 2, and the next distinct rank will be 3 (no ranks are skipped). This results in a continuous sequence of ranks without gaps.

flowchart TD
    A[Start Ranking] --> B{Process Row}
    B --"Order By Value"--> C{Is current row tied with previous?}
    C --"Yes"--> D[Assign same rank as previous]
    C --"No"--> E[Assign next consecutive rank (no skipping)]
    D --> F[Move to next row]
    E --> F
    F --"More Rows?"--> B
    F --"No"--> G[End Ranking]

Flowchart illustrating the logic of the DENSE_RANK() function, showing no rank skipping for ties.

Practical Example: Employee Salaries

Let's consider a table of employees with their departments and salaries. We want to rank employees by salary within each department. This example will clearly demonstrate the difference between RANK() and DENSE_RANK().

CREATE TABLE Employees (
    EmployeeID NUMBER PRIMARY KEY,
    Department VARCHAR2(50),
    Salary NUMBER
);

INSERT INTO Employees (EmployeeID, Department, Salary) VALUES (1, 'Sales', 50000);
INSERT INTO Employees (EmployeeID, Department, Salary) VALUES (2, 'Sales', 60000);
INSERT INTO Employees (EmployeeID, Department, Salary) VALUES (3, 'Sales', 60000);
INSERT INTO Employees (EmployeeID, Department, Salary) VALUES (4, 'Sales', 70000);
INSERT INTO Employees (EmployeeID, Department, Salary) VALUES (5, 'Marketing', 55000);
INSERT INTO Employees (EmployeeID, Department, Salary) VALUES (6, 'Marketing', 55000);
INSERT INTO Employees (EmployeeID, Department, Salary) VALUES (7, 'Marketing', 65000);
INSERT INTO Employees (EmployeeID, Department, Salary) VALUES (8, 'HR', 45000);
INSERT INTO Employees (EmployeeID, Department, Salary) VALUES (9, 'HR', 50000);
INSERT INTO Employees (EmployeeID, Department, Salary) VALUES (10, 'HR', 50000);

SQL script to create the Employees table and insert sample data.

SELECT
    EmployeeID,
    Department,
    Salary,
    RANK() OVER (PARTITION BY Department ORDER BY Salary DESC) AS Rank_Value,
    DENSE_RANK() OVER (PARTITION BY Department ORDER BY Salary DESC) AS Dense_Rank_Value
FROM
    Employees
ORDER BY
    Department, Salary DESC;

Query demonstrating RANK() and DENSE_RANK() on the Employees table.

Let's analyze the output for the 'Sales' department:

EmployeeID	Department	Salary	Rank_Value	Dense_Rank_Value
4	Sales	70000	1	1
2	Sales	60000	2	2
3	Sales	60000	2	2
1	Sales	50000	4	3

Notice how for the 'Sales' department, employees with Salary 60000 both get rank 2. With RANK(), the next salary (50000) gets rank 4, skipping rank 3. With DENSE_RANK(), the next salary (50000) gets rank 3, maintaining a dense sequence.

💡

Always consider how ties should be handled in your ranking logic. If you need a continuous sequence of ranks without gaps, DENSE_RANK() is your choice. If gaps are acceptable or even desired to reflect the 'skipped' positions due to ties, RANK() is more appropriate.

When to Use Which Function

The choice between RANK() and DENSE_RANK() depends entirely on the specific business requirement:

Use RANK() when:
- You need to identify the top N items, and ties should push subsequent items further down the list (e.g., top 3 finishers in a race where ties mean the next person is 4th, not 3rd).
- You want to see the 'true' position of an item if all tied items were considered to occupy their respective positions.
Use DENSE_RANK() when:
- You need a continuous sequence of ranks without any gaps, regardless of ties (e.g., assigning medal tiers where multiple people can get gold, and the next tier is silver, not bronze).
- You are categorizing items into distinct rank groups (e.g., 'top tier', 'second tier', etc.) and want these tiers to be sequential.

ℹ️

Oracle also provides ROW_NUMBER() and NTILE() as other ranking functions. ROW_NUMBER() assigns a unique, sequential integer to each row within its partition, even for ties (the order of tied rows is arbitrary unless further ORDER BY criteria are specified). NTILE() divides rows into a specified number of groups.