MySQL ROW_NUMBER Function: Guide from Basics to Alternatives

1. Introduction
- 1.1 Target Audience
- 1.2 Benefits of the ROW_NUMBER() Function
2. What is the ROW_NUMBER function
3. Practical Use Cases
4. Comparison with Other Window Functions
5. Alternatives for MySQL versions below 8.0
6. Points to Note and Best Practices
7. Summary

1. Introduction

MySQL 8.0 added many new features, and support for window functions stands out among them. This article focuses on the frequently used ROW_NUMBER() function. ROW_NUMBER() is a powerful tool for data analysis and reporting that makes it easy to number and rank rows by specific criteria. The article covers the function's basic usage, practical examples, and alternative methods for earlier versions.

Target Audience

Beginners to intermediate users with basic SQL knowledge
Engineers and data analysts who use MySQL for data processing and analysis
Those considering migrating to the latest MySQL version

Benefits of the ROW_NUMBER() Function

This function can assign a unique number to rows based on specific conditions. This enables tasks such as “creating a ranking of highest sales” or “extracting and organizing duplicate data” to be expressed concisely. In earlier versions, complex queries using user variables were required, but using the ROW_NUMBER() function improves code simplicity and readability. In this article, we provide concrete query examples and explanations that are easy for beginners to understand. In the next section, we will examine the function’s basic syntax and behavior in detail.

2. What is the ROW_NUMBER function

The ROW_NUMBER() function, newly added in MySQL 8.0, is a type of window function that assigns a sequential number to each row of data. It provides the ability to number rows according to a specific order or group, which is extremely useful for data analysis and report generation. Here, we will explain it in detail, including its basic syntax and concrete examples.

Basic Syntax of the ROW_NUMBER Function

First, the basic format of the ROW_NUMBER() function is as follows.

SELECT
    column_name,
    -- row_number is a reserved word in MySQL 8.0, so the alias is row_num
    ROW_NUMBER() OVER (PARTITION BY group_column ORDER BY order_column) AS row_num
FROM
    table_name;

Meaning of Each Element

ROW_NUMBER(): A function that assigns a sequential number to each row.
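OVER: The clause that defines the window, that is, the set and order of rows the function operates on.
PARTITION BY: Groups the data by the specified column so that numbering restarts at 1 in each group. It is optional; if omitted, a single sequence is applied across all rows.
ORDER BY: Specifies the order in which numbers are assigned within each partition. It is also optional, but without it the order in which rows are numbered is not guaranteed.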

Basic Example

For example, suppose there is a table named “sales” with the following data.

employee	department	sale
A	Sales	500
B	Sales	800
C	Development	600
D	Development	700

To assign sequential numbers within each department based on descending sales amounts, use the following query.

SELECT
    employee,
    department,
    sale,
    ROW_NUMBER() OVER (PARTITION BY department ORDER BY sale DESC) AS row_num
FROM
    sales;

Result

employee	department	sale	row_num
B	Sales	800	1
A	Sales	500	2
D	Development	700	1
C	Development	600	2

From this result, we can see that the ranking by sales for each department is displayed.

How to Use PARTITION BY

In the example above, the data is grouped by the “department” column, which results in separate sequential numbers for each department. If you omit PARTITION BY, a single sequence is assigned across all rows.

SELECT
    employee,
    sale,
    ROW_NUMBER() OVER (ORDER BY sale DESC) AS row_num
FROM
    sales;

Result

employee	sale	row_num
B	800	1
D	700	2
C	600	3
A	500	4

Features and Considerations of the ROW_NUMBER() Function

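Unique numbering: Each row receives a distinct number even when the ORDER BY values are identical. The order among tied rows is not guaranteed, so add a tie-breaking column to ORDER BY if the numbering must be reproducible.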
Handling of NULL values: When NULLs are present in ORDER BY, they appear first in ascending order and last in descending order.
Performance impact: In large datasets, ORDER BY can be costly, so setting appropriate indexes is important.
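
The point about unique numbering and ties can be illustrated with a small sketch. The table name sales_demo and its column types are assumptions made for this example only; they do not come from the article's sales table. The second ORDER BY column acts as a tie-breaker so that two rows with the same sale value are always numbered in the same order.

```sql
-- Hypothetical demo table (name and column types are assumptions).
DROP TABLE IF EXISTS sales_demo;
CREATE TABLE sales_demo (
    employee   VARCHAR(20) NOT NULL,
    department VARCHAR(20) NOT NULL,
    sale       INT NULL
);

INSERT INTO sales_demo (employee, department, sale) VALUES
    ('A', 'Sales', 800),
    ('B', 'Sales', 800),
    ('C', 'Sales', 600);

-- A and B tie on sale; employee breaks the tie, so A is numbered before B on every run.
SELECT
    employee,
    sale,
    ROW_NUMBER() OVER (ORDER BY sale DESC, employee) AS row_num
FROM
    sales_demo
ORDER BY
    row_num;
```

Without the employee column in the window's ORDER BY, A and B would still receive the numbers 1 and 2, but which of them gets which number could differ between runs.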

3. Practical Use Cases

Here we introduce concrete scenarios that use MySQL’s ROW_NUMBER() function. This function can be applied in many practical cases such as data ranking and handling duplicate data.

3-1. Ranking Within Groups

For example, consider a case where you want to assign ranks to sales data in descending order of revenue for each department. Below is an example dataset.

employee	department	sale
A	Sales	500
B	Sales	800
C	Development	600
D	Development	700

Query Example: Sales Ranking by Department

SELECT
    employee,
    department,
    sale,
    -- rank is a reserved word in MySQL 8.0, so the alias is sales_rank
    ROW_NUMBER() OVER (PARTITION BY department ORDER BY sale DESC) AS sales_rank
FROM
    sales;

Result:

employee	department	sale	sales_rank
B	Sales	800	1
A	Sales	500	2
D	Development	700	1
C	Development	600	2

In this way, sequential numbers are assigned in sales order for each department, making ranking easy.

3-2. Extracting Top N Records

Next, let’s look at a case where you want to extract the top 3 sales employees for each department.

Query Example: Extract Top N Records

WITH RankedSales AS (
    SELECT
        employee,
        department,
        sale,
        -- rank is a reserved word in MySQL 8.0, so the alias is sales_rank
        ROW_NUMBER() OVER (PARTITION BY department ORDER BY sale DESC) AS sales_rank
    FROM
        sales
)
SELECT
    employee,
    department,
    sale
FROM
    RankedSales
WHERE
    sales_rank <= 3;

Result:

employee	department	sale
B	Sales	800
A	Sales	500
D	Development	700
C	Development	600

This query keeps at most three rows per department. Because the sample data has only two employees per department, every row is returned here. Thus, the ROW_NUMBER() function is suitable not only for ranking but also for filtering top records.

3-3. Extracting and Deleting Duplicate Data

Databases can contain duplicate data. In such cases, you can also handle them easily using the ROW_NUMBER() function.

Query Example: Extract Duplicate Data

SELECT *
FROM (
    SELECT
        employee,
        sale,
        -- rank is a reserved word in MySQL 8.0, so the alias is dup_num
        ROW_NUMBER() OVER (PARTITION BY employee ORDER BY sale DESC) AS dup_num
    FROM
        sales
) tmp
WHERE dup_num > 1;

This query numbers the rows within each employee name, so any row numbered 2 or higher is a duplicate of an earlier row for the same employee.

Query Example: Delete Duplicate Data

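-- Assumes sales has a unique id column (not shown in the sample data).
DELETE FROM sales
WHERE id IN (
    SELECT id
    FROM (
        SELECT
            id,
            -- rank is a reserved word in MySQL 8.0, so the alias is dup_num
            ROW_NUMBER() OVER (PARTITION BY employee ORDER BY sale DESC) AS dup_num
        FROM
            sales
    ) tmp
    WHERE dup_num > 1
);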
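
Because the sample sales table shown earlier has no id column, the following is a minimal self-contained sketch of the same idea. The table sales_dedup_demo and its id column are assumptions for illustration. It previews the rows that would be removed before deleting them, and adds id as a tie-breaker so that the row kept for each employee is predictable.

```sql
-- Hypothetical demo table with a surrogate key (an assumption, not the article's table).
DROP TABLE IF EXISTS sales_dedup_demo;
CREATE TABLE sales_dedup_demo (
    id       INT AUTO_INCREMENT PRIMARY KEY,
    employee VARCHAR(20) NOT NULL,
    sale     INT NOT NULL
);

INSERT INTO sales_dedup_demo (employee, sale) VALUES
    ('A', 500),
    ('A', 500),
    ('B', 800);

-- Step 1: preview the rows that count as duplicates (numbered 2 or higher per employee).
SELECT id, employee, sale
FROM (
    SELECT
        id,
        employee,
        sale,
        ROW_NUMBER() OVER (PARTITION BY employee ORDER BY sale DESC, id) AS dup_num
    FROM
        sales_dedup_demo
) AS numbered
WHERE dup_num > 1;

-- Step 2: delete exactly those rows. The derived table lets the DELETE
-- read from the same table it modifies.
DELETE FROM sales_dedup_demo
WHERE id IN (
    SELECT id
    FROM (
        SELECT
            id,
            ROW_NUMBER() OVER (PARTITION BY employee ORDER BY sale DESC, id) AS dup_num
        FROM
            sales_dedup_demo
    ) AS numbered
    WHERE dup_num > 1
);
```

With this data, the second row for employee A (id 2) is the only one numbered above 1, so it is the only row removed.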

Summary

The ROW_NUMBER() function is useful in scenarios such as:

Ranking within groups
Extracting top N records
Detecting and deleting duplicate data

It lets you express these otherwise complex data processing and analysis tasks concisely.

4. Comparison with Other Window Functions

MySQL 8.0 provides, in addition to ROW_NUMBER(), window functions such as RANK() and DENSE_RANK() for ranking and position calculations. While these functions serve similar purposes, they differ in behavior and results. Here we compare each function and explain appropriate use cases.

4-1. RANK() Function

The RANK() function assigns rankings, giving identical values the same rank and then skipping the ranks that the tied rows occupy.

Basic Syntax

SELECT
    column_name,
    -- rank is a reserved word in MySQL 8.0, so the alias is rnk
    RANK() OVER (PARTITION BY group_column ORDER BY sort_column) AS rnk
FROM
    table_name;

Example

We use the following data to calculate sales rankings.

employee	department	sale
A	Sales	800
B	Sales	800
C	Sales	600
D	Sales	500

Query example: Using RANK()

SELECT
    employee,
    sale,
    -- rank is a reserved word in MySQL 8.0, so the alias is sales_rank
    RANK() OVER (ORDER BY sale DESC) AS sales_rank
FROM
    sales;

Result:

employee	sale	sales_rank
A	800	1
B	800	1
C	600	3
D	500	4

Features:

A and B, which have the same sales amount (800), are treated as rank 1.
The next rank, 2, is skipped, so C becomes rank 3.

4-2. DENSE_RANK() Function

Like RANK(), the DENSE_RANK() function assigns the same rank to identical values, but it does not skip the next rank.

Basic Syntax

SELECT
    column_name,
    -- rank is a reserved word in MySQL 8.0, so the alias is dense_rnk
    DENSE_RANK() OVER (PARTITION BY group_column ORDER BY sort_column) AS dense_rnk
FROM
    table_name;

Example

We use the same data as before to try the DENSE_RANK() function.

Query example: Using DENSE_RANK()

SELECT
    employee,
    sale,
    -- dense_rank is a reserved word in MySQL 8.0, so the alias is sales_dense_rank
    DENSE_RANK() OVER (ORDER BY sale DESC) AS sales_dense_rank
FROM
    sales;

Result:

employee	sale	sales_dense_rank
A	800	1
B	800	1
C	600	2
D	500	3

Features:

A and B, with the same sales amount (800), are treated as rank 1.
Unlike RANK(), the next rank starts at 2, preserving rank continuity.

4-3. Difference from ROW_NUMBER() Function

The ROW_NUMBER() function differs from the other two functions in that it assigns a unique number even when values are the same.

Example

SELECT
    employee,
    sale,
    ROW_NUMBER() OVER (ORDER BY sale DESC) AS row_num
FROM
    sales;

Result:

employee	sale	row_num
A	800	1
B	800	2
C	600	3
D	500	4

Features:

Even when values are identical, a unique number is assigned, so there are no duplicate ranks.
It is suitable when you need strict control over data order or require row-level uniqueness.

4-4. Summary of Use Cases for Each Function

Function	Ranking Behavior	Use Case
ROW_NUMBER()	Assigns a unique number	When adding sequential numbers to data or when a unique identifier is needed
RANK()	Assigns the same number to ties and skips the next number	When you want to display ranking positions as they are with ties
DENSE_RANK()	Assigns the same number to ties without skipping numbers	When continuity of ranking is important
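
To see the three behaviors in the table above side by side, the functions can be placed in one query. This is a sketch under the assumption of a throwaway table named sales_compare_demo (a hypothetical name) loaded with the same four rows used in section 4-1. ROW_NUMBER() gets employee as a tie-breaker, while RANK() and DENSE_RANK() order by sale alone so that A and B remain tied.

```sql
-- Hypothetical demo table holding the section 4-1 sample rows.
DROP TABLE IF EXISTS sales_compare_demo;
CREATE TABLE sales_compare_demo (
    employee   VARCHAR(20) NOT NULL,
    department VARCHAR(20) NOT NULL,
    sale       INT NOT NULL
);

INSERT INTO sales_compare_demo (employee, department, sale) VALUES
    ('A', 'Sales', 800),
    ('B', 'Sales', 800),
    ('C', 'Sales', 600),
    ('D', 'Sales', 500);

SELECT
    employee,
    sale,
    -- Unique number per row; employee decides the order of tied rows.
    ROW_NUMBER() OVER (ORDER BY sale DESC, employee) AS row_num,
    -- Ties share a rank and the following rank is skipped.
    RANK()       OVER (ORDER BY sale DESC)           AS sales_rank,
    -- Ties share a rank and no rank is skipped.
    DENSE_RANK() OVER (ORDER BY sale DESC)           AS sales_dense_rank
FROM
    sales_compare_demo
ORDER BY
    row_num;
```

For employee C this yields row_num 3, sales_rank 3, and sales_dense_rank 2, which matches the individual result tables in sections 4-1 to 4-3.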

Summary

ROW_NUMBER(), RANK(), and DENSE_RANK() each suit different scenarios.

ROW_NUMBER() is suitable when a unique number is needed.
RANK() is useful when you want to assign the same rank to identical values while emphasizing gaps in ranking.
DENSE_RANK() is appropriate for scenarios that require continuous ranking without gaps.

5. Alternatives for MySQL versions below 8.0

MySQL versions prior to 8.0 do not support the ROW_NUMBER() function or other window functions. However, by leveraging user variables, you can achieve similar functionality. This section explains concrete alternatives for MySQL versions below 8.0.

5-1. Sequential numbering using user variables

In environments with MySQL 5.7 or earlier, you can assign a sequential number to each row using user variables. See the example below.

Example: Sales ranking by department

Sample data:

employee	department	sale
A	Sales	500
B	Sales	800
C	Development	600
D	Development	700

Query:

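-- Reset the variables before every run; they keep their values for the whole session.
SET @row_num = 0;
SET @dept = '';

SELECT
    employee,
    department,
    sale,
    -- Restart at 1 when the department changes, otherwise add 1.
    @row_num := IF(@dept = department, @row_num + 1, 1) AS sales_rank,
    -- Helper column: remembers the department of the current row.
    @dept := department AS current_dept
FROM
    (SELECT * FROM sales ORDER BY department, sale DESC) AS sorted_sales;

Result (helper column omitted; rows follow the ORDER BY, so Development comes first):

employee	department	sale	sales_rank
D	Development	700	1
C	Development	600	2
B	Sales	800	1
A	Sales	500	2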

5-2. Extracting the top N rows

To retrieve the top N rows, you can similarly use user variables.

Query:

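-- Reset the variables before every run; they keep their values for the whole session.
SET @row_num = 0;
SET @dept = '';

SELECT
    employee,
    department,
    sale,
    sales_rank
FROM (
    SELECT
        employee,
        department,
        sale,
        @row_num := IF(@dept = department, @row_num + 1, 1) AS sales_rank,
        -- Helper column: remembers the department of the current row.
        @dept := department AS current_dept
    FROM
        (SELECT * FROM sales ORDER BY department, sale DESC) AS sorted_sales
) AS ranked_sales
WHERE sales_rank <= 3;

Result:

employee	department	sale	sales_rank
D	Development	700	1
C	Development	600	2
B	Sales	800	1
A	Sales	500	2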

This query assigns a rank per department and then extracts only the rows within the top three.

5-3. Detecting and removing duplicate rows

Duplicate data handling can also be addressed using user variables.

Query example: Detect duplicate rows

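-- Reset the variables before every run; they keep their values for the whole session.
SET @row_num = 0;
SET @id_check = '';

SELECT
    id,
    name,
    dup_num
FROM (
    SELECT
        id,
        name,
        -- Restart at 1 when the name changes, otherwise add 1.
        @row_num := IF(@id_check = name, @row_num + 1, 1) AS dup_num,
        -- Helper column: remembers the name of the current row.
        @id_check := name AS current_name
    FROM
        (SELECT * FROM customers ORDER BY name, id) AS sorted_customers
) AS tmp
WHERE dup_num > 1;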

Query example: Delete duplicate rows

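-- Reset the variables first; values left over from an earlier query would skew the numbering.
SET @row_num = 0;
SET @id_check = '';

DELETE FROM customers
WHERE id IN (
    SELECT id
    FROM (
        SELECT
            id,
            @row_num := IF(@id_check = name, @row_num + 1, 1) AS dup_num,
            @id_check := name AS current_name
        FROM
            (SELECT * FROM customers ORDER BY name, id) AS sorted_customers
    ) AS tmp
    WHERE dup_num > 1
);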

5-4. Cautions when using user variables

Session dependency

User variables belong to the session that sets them. They are not visible to other sessions, and they keep their values between statements in the same session, so reset them before each query.

Dependency on execution order

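The result depends on the order in which rows are processed, so setting the ORDER BY clause correctly is crucial. Note also that MySQL does not guarantee the evaluation order of expressions that read and assign the same user variable within one statement, so verify the output on your server version.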

Readability and maintainability of SQL

Because the code can become complex, using window functions is recommended for MySQL 8.0 and later.
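
If the cautions above make user variables unattractive, one alternative that also runs on MySQL 5.7 is a correlated subquery that counts how many rows sort at or before the current row. The sketch below is not from the original article; the table name sales_rownum_demo is an assumption, and employee is assumed to be unique within a department so it can serve as a tie-breaker.

```sql
-- Hypothetical demo table holding the section 5-1 sample rows.
DROP TABLE IF EXISTS sales_rownum_demo;
CREATE TABLE sales_rownum_demo (
    employee   VARCHAR(20) NOT NULL,
    department VARCHAR(20) NOT NULL,
    sale       INT NOT NULL
);

INSERT INTO sales_rownum_demo (employee, department, sale) VALUES
    ('A', 'Sales', 500),
    ('B', 'Sales', 800),
    ('C', 'Development', 600),
    ('D', 'Development', 700);

SELECT
    s.employee,
    s.department,
    s.sale,
    -- Count the rows in the same department that have a higher sale,
    -- or the same sale and an employee name that sorts at or before this one.
    (
        SELECT COUNT(*)
        FROM sales_rownum_demo AS t
        WHERE t.department = s.department
          AND (t.sale > s.sale
               OR (t.sale = s.sale AND t.employee <= s.employee))
    ) AS sales_rank
FROM
    sales_rownum_demo AS s
ORDER BY
    s.department,
    sales_rank;
```

This form does not depend on evaluation order, but the subquery runs once per row, so the cost grows quickly with the number of rows per department. It is best suited to small or well-indexed tables.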

Summary

In MySQL versions prior to 8.0, you can use user variables to achieve sequential numbering and ranking since window functions are unavailable. However, because queries can become complex, migrating to a newer version is advisable when possible.

6. Points to Note and Best Practices

MySQL’s ROW_NUMBER() function and alternative methods that use user variables are very handy, but there are points to watch to operate them accurately and efficiently. Here we explain in detail the cautions when using them and best practices for performance optimization.

6-1. Performance Considerations

1. ORDER BY Overhead

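The ROW_NUMBER() function is almost always used together with ORDER BY inside OVER, because without it the order in which rows are numbered is not guaranteed. This operation sorts the data, so processing time increases with large data sets.

Countermeasure:

Use indexes: Set indexes on columns used in ORDER BY to speed up sorting.
Filter early: Restrict the rows with a WHERE clause before numbering. LIMIT is applied after window functions are evaluated, so it trims the output but does not reduce the number of rows that are sorted.

Example:

SELECT
    employee,
    sale,
    ROW_NUMBER() OVER (PARTITION BY department ORDER BY sale DESC) AS sales_rank
FROM
    sales
WHERE
    department = 'Sales';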

2. Increased Memory Usage and Disk I/O

Window functions use temporary tables and memory, so as data volume grows, memory consumption and disk I/O increase.

Countermeasure:

Split queries: Break the work into smaller queries and extract data step by step to reduce load.
Use temporary tables: Store extracted data in a temporary table and perform aggregation from there to distribute the load.
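
The temporary-table countermeasure can be sketched as follows. The names sales_source_demo and sales_stage are assumptions for illustration; in practice the source would be your own table and the filter would match your workload. The idea is to copy only the needed rows and columns into a session-local temporary table, then number that smaller set.

```sql
-- Hypothetical source table (an assumption standing in for a large table).
DROP TABLE IF EXISTS sales_source_demo;
CREATE TABLE sales_source_demo (
    employee   VARCHAR(20) NOT NULL,
    department VARCHAR(20) NOT NULL,
    sale       INT NOT NULL
);

INSERT INTO sales_source_demo (employee, department, sale) VALUES
    ('A', 'Sales', 500),
    ('B', 'Sales', 800),
    ('C', 'Development', 600),
    ('D', 'Development', 700);

-- Step 1: stage only the rows and columns needed for the report.
DROP TEMPORARY TABLE IF EXISTS sales_stage;
CREATE TEMPORARY TABLE sales_stage AS
SELECT employee, sale
FROM sales_source_demo
WHERE department = 'Sales';

-- Step 2: number the staged rows; the window now sorts only the subset.
SELECT
    employee,
    sale,
    ROW_NUMBER() OVER (ORDER BY sale DESC, employee) AS row_num
FROM
    sales_stage;
```

A temporary table is visible only to the session that created it and is dropped automatically when that session ends.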

6-2. Query Tuning Points

1. Checking the Execution Plan

In MySQL you can use the EXPLAIN statement to view a query’s execution plan. This lets you verify whether indexes are being used correctly.

Example:

EXPLAIN
SELECT
    employee,
    -- rank is a reserved word in MySQL 8.0, so the alias is sales_rank
    ROW_NUMBER() OVER (PARTITION BY department ORDER BY sale DESC) AS sales_rank
FROM
    sales;

Sample Output:

id	select_type	table	type	possible_keys	key	key_len	ref	rows	Extra
1	SIMPLE	sales	index	NULL	sale	4	NULL	500	Using index

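If Using index is shown in the Extra column, the query can be answered from the index alone (a covering index). If Using filesort or Using temporary appears instead, MySQL is sorting the rows or building a temporary table, which is worth examining on large tables.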
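
For window queries, the tree-style plan output is often easier to read than the tabular form, because it lists the sort and windowing steps explicitly. The sketch below assumes MySQL 8.0.16 or later, where EXPLAIN FORMAT=TREE is available, and a hypothetical table named sales_explain_demo. The exact plan text depends on the server version and data, so no sample output is shown.

```sql
-- Hypothetical demo table with the composite index discussed in this section.
DROP TABLE IF EXISTS sales_explain_demo;
CREATE TABLE sales_explain_demo (
    employee   VARCHAR(20) NOT NULL,
    department VARCHAR(20) NOT NULL,
    sale       INT NOT NULL,
    INDEX idx_department_sale (department, sale)
);

INSERT INTO sales_explain_demo (employee, department, sale) VALUES
    ('A', 'Sales', 500),
    ('B', 'Sales', 800),
    ('C', 'Development', 600),
    ('D', 'Development', 700);

-- Show the plan as a tree of operations without executing the query.
EXPLAIN FORMAT=TREE
SELECT
    employee,
    ROW_NUMBER() OVER (PARTITION BY department ORDER BY sale DESC) AS row_num
FROM
    sales_explain_demo;
```

On MySQL 8.0.18 and later, EXPLAIN ANALYZE additionally runs the query and reports measured timings for each step, so use it with care on expensive statements.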

2. Index Optimization

Make sure to set indexes on columns used in ORDER BY and WHERE clauses. Pay particular attention to the following points.

Single-column index: Applied for simple sort conditions
Composite index: Effective when multiple columns are involved in conditions

Example:

CREATE INDEX idx_department_sale ON sales(department, sale DESC);

3. Leveraging Batch Processing

Instead of processing a huge amount of data at once, you can distribute load by using batch processing to handle data in smaller chunks. Add an ORDER BY on a unique column so that each batch covers a stable, non-overlapping range of rows.

Example:

SELECT * FROM sales WHERE department = 'Sales' ORDER BY employee LIMIT 1000 OFFSET 0;
SELECT * FROM sales WHERE department = 'Sales' ORDER BY employee LIMIT 1000 OFFSET 1000;
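
With large offsets, MySQL still has to read and discard the skipped rows, so later batches get slower. A common alternative is to continue from the last key seen instead of using OFFSET. The following is a sketch under the assumption that the table has a unique, indexed id column; the table sales_batch_demo and the variable @last_id are hypothetical names introduced for this example.

```sql
-- Hypothetical demo table with a unique id to page on.
DROP TABLE IF EXISTS sales_batch_demo;
CREATE TABLE sales_batch_demo (
    id         INT AUTO_INCREMENT PRIMARY KEY,
    employee   VARCHAR(20) NOT NULL,
    department VARCHAR(20) NOT NULL,
    sale       INT NOT NULL
);

INSERT INTO sales_batch_demo (employee, department, sale) VALUES
    ('A', 'Sales', 500),
    ('B', 'Sales', 800),
    ('C', 'Development', 600),
    ('D', 'Development', 700);

-- First batch: start from the beginning, ordered by id.
SELECT id, employee, department, sale
FROM sales_batch_demo
WHERE department = 'Sales'
ORDER BY id
LIMIT 1000;

-- Next batch: continue after the highest id returned by the previous batch.
-- The value 1 is a placeholder; the application would supply the real one.
SET @last_id = 1;

SELECT id, employee, department, sale
FROM sales_batch_demo
WHERE department = 'Sales'
  AND id > @last_id
ORDER BY id
LIMIT 1000;
```

Each batch can then locate its starting point through the primary key rather than by skipping rows.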

6-3. Maintaining Data Consistency

1. Data Updates and Recalculation

Adding or deleting data can cause numbering to become out of sync. Therefore, set up a mechanism to recalculate the numbered data as needed.

Example:

CREATE VIEW ranked_sales AS
SELECT
    employee,
    sale,
    -- rank is a reserved word in MySQL 8.0, so the alias is sales_rank
    ROW_NUMBER() OVER (PARTITION BY department ORDER BY sale DESC) AS sales_rank
FROM
    sales;

Because a view is evaluated each time it is queried, the ranking always reflects the latest data.

6-4. Query Example Best Practices

Below is an example of best practices that consider performance and maintainability.

Example: Extract Top N Rows

WITH RankedSales AS (
    SELECT
        employee,
        department,
        sale,
        -- rank is a reserved word in MySQL 8.0, so the alias is sales_rank
        ROW_NUMBER() OVER (PARTITION BY department ORDER BY sale DESC) AS sales_rank
    FROM
        sales
)
SELECT *
FROM RankedSales
WHERE sales_rank <= 3;

This structure uses a Common Table Expression (CTE) to improve code readability and reusability.

Summary

When using the ROW_NUMBER() function or its alternatives, it is important to keep the following points in mind.

Optimize indexes to improve processing speed.
Check the execution plan to identify performance bottlenecks.
Address data updates and implement mechanisms to continuously maintain consistency.
Use batch processing to distribute load, and CTEs to keep queries readable.

By applying these best practices, you can efficiently perform large-scale data analysis and reporting.

7. Summary

In this article, we focused on MySQL’s ROW_NUMBER() function, covering its basic usage, advanced examples, alternative methods for older versions, and cautions and best practices in detail. In this section, we review the key points of the article and summarize practical usage tips.

7-1. Convenience of the ROW_NUMBER() Function

The ROW_NUMBER() function is extremely useful in data analysis and report generation for the following reasons.

Sequential numbering per group: Easily achieve sales rankings by department or rankings by category.
Extracting top N rows: Efficiently retrieve and filter data based on specific criteria.
Detecting and removing duplicate data: Helpful for data organization and cleaning tasks.

These capabilities allow you to write complex queries more simply, greatly improving SQL readability and maintainability.

7-2. Comparison with Other Window Functions

Compared with window functions such as RANK() and DENSE_RANK(), ROW_NUMBER() differs in that it assigns a unique number to each row, even when values are the same.

Function	Feature	Use case
ROW_NUMBER()	Assigns a unique sequential number to each row	Ideal when you need to identify data or rank without allowing duplicates
RANK()	Assigns the same rank to ties and skips subsequent ranks	Useful when ranking with ties and gaps between ranks matter
DENSE_RANK()	Assigns the same rank to ties but does not skip ranks	Ranking that considers ties while preserving rank continuity

Choosing the appropriate function: Selecting the right function for your use case enables efficient data processing.

7-3. Strategies for Older Versions

In environments below MySQL 8.0, we also showed how you can use user-defined variables to achieve similar functionality. However, these approaches come with the following considerations.

Reduced readability due to increased code complexity
Optimization of query execution can be difficult
Additional processing is required to maintain data integrity

If possible, we strongly recommend upgrading to MySQL 8.0 or later and using window functions.

7-4. Performance Optimization Tips

Leverage indexes: Create indexes on columns used in ORDER BY to improve processing speed.
Review execution plans: Use the EXPLAIN statement to pre‑validate performance.
Adopt batch processing: Break large datasets into smaller chunks to distribute load.
Utilize views and CTEs: Increase code reusability and simplify complex queries.

Applying these optimization techniques enables efficient and stable data processing.

7-5. Conclusion

The ROW_NUMBER() function is a powerful tool that dramatically improves data analysis efficiency. This article covered everything from basic syntax to advanced examples, cautions, and alternative methods. We encourage you to try out the queries as you follow along. By improving your SQL skills, you’ll gain confidence tackling more complex data analyses and report generation.

Appendix: Reference Resources

Official documentation: MySQL Window Functions
SQL execution environment: SQL Fiddle (an online tool for running and testing SQL)
