MySQL WITH Clause: From Beginner Basics to Practical Use

目次

1. Introduction

MySQL is a database management system used by many developers and database administrators, offering powerful and flexible SQL capabilities. Among them, the WITH clause (Common Table Expression: CTE) is a powerful tool that makes SQL queries more readable and improves maintainability. In this article, we will thoroughly explain the basics to advanced usage of this WITH clause, targeting beginners to intermediate users. In particular, we will cover practical topics such as Replacing subqueries and Implementing recursive queries. This article aims to provide concrete solutions for those learning SQL or struggling with query optimization in daily work. Follow the outline below to understand the fundamentals of the WITH clause and apply them in practice.

2. Basics of the WITH Clause (Common Table Expressions)

What is the WITH clause?

WITH clause is a syntax that defines a temporary result set (common table expression, CTE) within an SQL query and allows it to be used in subsequent queries. It has been supported since MySQL 8.0 and can replace complex subqueries with a concise and readable form. For example, writing subqueries directly can reduce readability and make the overall query longer. Using the WITH clause lets you split the query into logical blocks, making it easier to understand.

Basic Syntax of the WITH Clause

Below is the basic syntax of the WITH clause.
WITH table_name AS (
  SELECT col1, col2
  FROM source_table
  WHERE condition
)
SELECT col1, col2
FROM table_name;
In this syntax, a virtual table (common table expression) is defined after WITH and used in the main query. This allows repeatedly used subqueries to be expressed concisely.

Differences from Subqueries and Views

The WITH clause creates a temporarily available result set and differs from subqueries and views in several ways.
FeatureWITH ClauseSubqueryView
ScopeValid only within the queryUsable only where definedReusable across the entire database
TemporalityTemporaryTemporaryPersistent
PurposeSimplify complex queriesTemporary data extractionData extraction for frequent reuse
The WITH clause offers higher readability than subqueries and is ideal when you don’t need to create a permanent object like a view.

Benefits of Using the WITH Clause

  1. Improved query readability Even with multiple subqueries, organizing them with the WITH clause clarifies the structure.
  2. Enhanced reusability Defining a temporary result set allows you to reference it multiple times within the query.
  3. Improved maintainability Logical division of the query makes modifications and extensions easier.

3. Basic Usage of MySQL WITH Clause

Replacing Subqueries

The WITH clause is a powerful tool for simplifying complex subqueries. Embedding subqueries directly can make the entire query complicated and hard to read, but using WITH improves readability. Below is a basic example of replacing a subquery with a WITH clause. When using a subquery:
SELECT AVG(sales.total) AS average_sales
FROM (
  SELECT SUM(amount) AS total
  FROM orders
  GROUP BY customer_id
) AS sales;
When using WITH:
WITH sales AS (
  SELECT SUM(amount) AS total
  FROM orders
  GROUP BY customer_id
)
SELECT AVG(sales.total) AS average_sales
FROM sales;
In this example, the temporary result set sales is defined with a WITH clause and used in the main query. This makes the overall query clearer and more organized.

Defining Multiple Common Table Expressions (CTEs)

The WITH clause allows you to define multiple CTEs, enabling further modularization of complex queries. Example:
WITH 
  sales_per_customer AS (
    SELECT customer_id, SUM(amount) AS total_sales
    FROM orders
    GROUP BY customer_id
  ),
  high_value_customers AS (
    SELECT customer_id
    FROM sales_per_customer
    WHERE total_sales > 10000
  )
SELECT customer_id
FROM high_value_customers;
In this example, sales_per_customer calculates each customer’s total sales, and based on that, high_value_customers extracts high‑spending customers. Using multiple CTEs sequentially lets you build the query step by step.

Using Nested CTEs

Nested CTEs allow you to perform even more complex data manipulations. Example:
WITH 
  sales_data AS (
    SELECT product_id, SUM(amount) AS total_sales
    FROM orders
    GROUP BY product_id
  ),
  ranked_sales AS (
    SELECT product_id, total_sales,
           RANK() OVER (ORDER BY total_sales DESC) AS rank
    FROM sales_data
  )
SELECT product_id, total_sales
FROM ranked_sales
WHERE rank <= 5;
In this query, sales_data aggregates sales for each product, and ranked_sales ranks them by sales. Finally, it extracts the top 5 products.

Key Points for Practical Use

  1. Break the query into parts Building CTEs step by step makes the query easier to read and simplifies debugging.
  2. Store intermediate results Consolidating reusable calculations or filter conditions into CTEs reduces code duplication.
  3. Considerations for large datasets Since CTEs generate temporary result sets, you need to account for performance impacts when handling large volumes of data.

4. Recursive WITH Clause Use Cases

What Is a Recursive WITH Clause?

A recursive WITH clause (recursive CTE) uses a common table expression to repeatedly execute a self‑referencing query, handling hierarchical data and iterative calculations. Recursive CTEs are supported in MySQL 8.0 and later, and are especially useful when working with parent‑child relationships or hierarchical structures.

Basic Syntax of Recursive CTE

When defining a recursive CTE, use the WITH RECURSIVE keyword. The basic syntax is as follows:
WITH RECURSIVE recursive_table_name AS (
  initial_query -- starting point of recursion
  UNION ALL
  recursive_query -- query called recursively
)
SELECT * FROM recursive_table_name;
  • Initial query: Retrieves the first dataset for the recursive process.
  • Recursive query: Generates new data based on the initial query or the previous result.
  • UNION ALL: Combines the results of the initial query and the recursive query.

Example: Processing Hierarchical Data

Recursive CTEs are used to expand data with hierarchical structures (e.g., organization trees or category trees). Example: Expanding an Employee Management Hierarchy Consider the following employees table as an example:
employee_idnamemanager_id
1AliceNULL
2Bob1
3Charlie1
4David2
Using this data, we create a query that retrieves the entire hierarchy starting from a given employee.
WITH RECURSIVE employee_hierarchy AS (
  -- Initial query: Retrieve top-level employees
  SELECT employee_id, name, manager_id, 1 AS level
  FROM employees
  WHERE manager_id IS NULL

  UNION ALL

  -- Recursive query: Retrieve direct subordinates
  SELECT e.employee_id, e.name, e.manager_id, eh.level + 1
  FROM employees e
  INNER JOIN employee_hierarchy eh
  ON e.manager_id = eh.employee_id
)
SELECT * FROM employee_hierarchy;
Result:
employee_idnamemanager_idlevel
1AliceNULL1
2Bob12
3Charlie12
4David23
This query recursively searches for subordinates based on manager_id, expanding the entire hierarchy.

Limitations and Considerations for Recursive CTEs

  1. Recursive termination condition required If the recursive query does not meet a termination condition, it may cause an infinite loop. Set appropriate conditions to prevent infinite loops.
  2. Impact on performance Recursive CTEs involve extensive computation on large datasets, which can increase query execution time. Use LIMIT clauses and filtering conditions to improve efficiency.
  3. Recursive depth limit MySQL imposes a limit on recursion depth, so caution is needed for very deep recursive operations. This limit can be configured with the max_recursive_iterations parameter.

Scenarios for Using Recursive CTEs

  • Traversing folder structures: Recursively search folders and subfolders.
  • Creating org charts: Visualize the hierarchy from managers to subordinates.
  • Displaying category trees: Retrieve hierarchies of product categories or tags.
Recursive CTEs provide a powerful way to write concise SQL queries in these scenarios, improving readability.

5. Cautions When Using WITH Clause

Impact on Performance and Optimization

  1. Recalculating CTEs A CTE defined with a WITH clause is generally recomputed each time the query references it. Therefore, using the same CTE multiple times can increase query execution time. Example:
   WITH sales AS (
     SELECT product_id, SUM(amount) AS total_sales
     FROM orders
     GROUP BY product_id
   )
   SELECT * FROM sales WHERE total_sales > 1000;
   SELECT COUNT(*) FROM sales;
In the above case, sales is referenced twice, so it is computed twice. To avoid this, storing the result in a temporary table is effective when referencing it multiple times. Solution:
   CREATE TEMPORARY TABLE temp_sales AS
   SELECT product_id, SUM(amount) AS total_sales
   FROM orders
   GROUP BY product_id;

   SELECT * FROM temp_sales WHERE total_sales > 1000;
   SELECT COUNT(*) FROM temp_sales;
  1. Splitting Complex CTEs If the WITH clause is nested too deeply, the entire query can become complex and difficult to debug. It is important to split the logic appropriately so that processing within a single CTE does not become overly complicated.

Using WITH with Large Datasets

The WITH clause generates a temporary dataset during execution. When handling large amounts of data, this can strain memory and storage. Mitigation Strategies:
  • Filter Data with WHERE Clause Filtering out unnecessary data within the CTE can reduce the amount of computation.
  WITH filtered_orders AS (
    SELECT *
    FROM orders
    WHERE order_date > '2023-01-01'
  )
  SELECT customer_id, SUM(amount)
  FROM filtered_orders
  GROUP BY customer_id;
  • Utilize LIMIT Clause When the data volume is large, use the LIMIT clause to extract only the needed rows.

MySQL Version Compatibility

MySQL’s WITH clause is supported in MySQL 8.0 and later. For earlier versions, you need to consider alternatives because the WITH clause cannot be used. Alternatives:
  • Use Subqueries Use subqueries directly instead of a WITH clause.
  SELECT AVG(total_sales)
  FROM (
    SELECT customer_id, SUM(amount) AS total_sales
    FROM orders
    GROUP BY customer_id
  ) AS sales;
  • Create Views If you need a reusable query, creating a view is also effective.
  CREATE VIEW sales_view AS
  SELECT customer_id, SUM(amount) AS total_sales
  FROM orders
  GROUP BY customer_id;

  SELECT AVG(total_sales) FROM sales_view;

Proper Use of WITH Clause

  1. Prioritize Readability The purpose of using a WITH clause is to organize queries and improve readability. Overusing it can actually make queries more complex, so use it only when needed.
  2. Validate Performance Check the execution plan (using the EXPLAIN command) and consider ways to optimize performance.
   EXPLAIN
   WITH sales AS (
     SELECT product_id, SUM(amount) AS total_sales
     FROM orders
     GROUP BY product_id
   )
   SELECT * FROM sales WHERE total_sales > 1000;

6. Real-World Application Examples

Aggregating Sales Data

This example aggregates sales data by month and then uses the results to calculate the monthly average sales. Example: Monthly Sales Aggregation and Average Calculation
WITH monthly_sales AS (
  SELECT 
    DATE_FORMAT(order_date, '%Y-%m') AS sales_month,
    SUM(amount) AS total_sales
  FROM orders
  GROUP BY sales_month
)
SELECT 
  sales_month, 
  total_sales,
  AVG(total_sales) OVER () AS average_sales
FROM monthly_sales;
In this query, monthly_sales calculates sales for each month, and then the overall average sales are computed from that. This organizes the data clearly and makes analysis easy.

Filtering Data Based on Specific Conditions

By separating complex condition filtering into WITH clauses, readability can be improved. Example: Creating a List of High-Value Purchasers
WITH customer_totals AS (
  SELECT 
    customer_id, 
    SUM(amount) AS total_spent
  FROM orders
  GROUP BY customer_id
)
SELECT 
  customer_id, 
  total_spent
FROM customer_totals
WHERE total_spent > 100000;
In this query, customer_totals calculates the total spend per customer, and then extracts the high-value purchasers that meet the condition.

Analyzing Hierarchical Data

When analyzing hierarchical data such as organizations or categories, recursive WITH clauses are extremely useful. Example: Retrieving a List of Direct Reports Within an Organization
WITH RECURSIVE employee_hierarchy AS (
  SELECT 
    employee_id, 
    name, 
    manager_id, 
    1 AS level
  FROM employees
  WHERE manager_id IS NULL

  UNION ALL

  SELECT 
    e.employee_id, 
    e.name, 
    e.manager_id, 
    eh.level + 1
  FROM employees e
  INNER JOIN employee_hierarchy eh
  ON e.manager_id = eh.employee_id
)
SELECT 
  employee_id, 
  name, 
  manager_id, 
  level
FROM employee_hierarchy
ORDER BY level, manager_id;
In this query, employee_hierarchy builds the hierarchical data and retrieves a list by employee level. This enables dynamic generation of information like an org chart.

Analysis Using Multiple Common Table Expressions

By leveraging multiple WITH clauses, you can process data stepwise and achieve complex analysis in a simple way. Example: Extracting Top-Selling Products by Category
WITH category_sales AS (
  SELECT 
    category_id, 
    product_id, 
    SUM(amount) AS total_sales
  FROM orders
  GROUP BY category_id, product_id
),
ranked_sales AS (
  SELECT 
    category_id, 
    product_id, 
    total_sales,
    RANK() OVER (PARTITION BY category_id ORDER BY total_sales DESC) AS rank
  FROM category_sales
)
SELECT 
  category_id, 
  product_id, 
  total_sales
FROM ranked_sales
WHERE rank <= 3;
In this query, sales are calculated per category, and the top three products are extracted. This is effective for narrowing data based on specific criteria.

Key Points for Real-World Applications

  1. Query Partitioning Design Using WITH clauses to split queries and process data stepwise allows complex analysis while maintaining readability.
  2. Extract Only Necessary Data Leveraging WHERE and LIMIT clauses to avoid processing unnecessary data enables efficient query design.
  3. Flexible Use in Business It can be flexibly applied to sales analysis, customer segmentation, inventory management, and other business needs.

7. FAQ (Frequently Asked Questions)

Q1: When should the WITH clause be used?

A1: The WITH clause is especially useful in the following scenarios:
  • When you want to write complex subqueries concisely.
  • When the same dataset is reused across multiple queries.
  • When you want to logically split a query to improve readability.
For example, in queries that use the same aggregated results multiple times, using a WITH clause can process efficiently.

Q2: When are recursive CTEs useful?

A2: Recursive CTEs are helpful when hierarchical structures or iterative calculations are needed. Specifically, they are used in scenarios such as:
  • Processing hierarchical data (e.g., expanding an organization tree or category hierarchy).
  • Displaying folder or file hierarchies.
  • Sequential calculations of numbers or periods (e.g., computing the Fibonacci sequence).
Using a recursive CTE allows you to easily expand and process self-referential data.

Q3: Are queries using the WITH clause more efficient than views?

A3: It depends on the case.
  • WITH clause creates a temporary result set that is used only within that query. It is suitable for data that does not need frequent reuse.
  • Views are stored persistently in the database and can be reused by other queries. They are suitable for queries that are used repeatedly.
Choosing between them as appropriate is important.

Q4: What causes performance degradation when using the WITH clause?

A4: The main reasons for performance degradation when using a WITH clause are:
  • Recalculation of the CTE: The result of the WITH clause is recomputed each time it is referenced, increasing processing time.
  • Processing large volumes of data: Generating large amounts of data within the WITH clause increases memory usage and degrades performance.
  • Lack of appropriate indexes: If the queries inside the WITH clause lack proper indexes, execution speed may suffer.
Mitigation:
  • Consider using temporary tables or views when reuse frequency is high.
  • Use WHERE and LIMIT clauses to appropriately filter data.

Q5: Are there alternatives for MySQL versions that do not support the WITH clause?

A5: Since MySQL versions prior to 8.0 do not support the WITH clause, the following alternatives are used:
  • Using subqueries Instead of a WITH clause, use subqueries directly.
  SELECT AVG(total_sales)
  FROM (
    SELECT customer_id, SUM(amount) AS total_sales
    FROM orders
    GROUP BY customer_id
  ) AS sales;
  • Using temporary tables Store the reusable dataset in a temporary table.
  CREATE TEMPORARY TABLE temp_sales AS
  SELECT customer_id, SUM(amount) AS total_sales
  FROM orders
  GROUP BY customer
  GROUP BY customer_id;

  SELECT AVG(total_sales) FROM temp_sales;

Q6: What are the best practices when using the WITH clause?

A6: When using the WITH clause, keep the following best practices in mind:
  • Prioritize simplicity: Complex

8. Summary

In this article, we provided a comprehensive overview of the WITH clause (common table expression, CTE) introduced in MySQL 8.0, covering everything from basics to advanced usage. The WITH clause is an extremely useful feature for making complex queries concise and readable. Below are the key points from this article.

Key Benefits of the WITH Clause

  1. Improved Query Readability By organizing complex subqueries concisely, it enhances the readability and maintainability of SQL code.
  2. Query Reusability It allows efficient processing when the same dataset is referenced multiple times.
  3. Supports Recursive Data Operations By leveraging recursive CTEs, you can handle hierarchical data and iterative calculations in a straightforward way.

Practical Application Points

  • Sales and Customer Data Analysis is facilitated, allowing results to be aggregated in stages.
  • Hierarchical Data Processing (such as org charts or category structures) can effectively use recursive CTEs.
  • By combining with views or temporary tables as needed, you can achieve flexible and efficient database operations.

Cautions When Using

  • The WITH clause is convenient, but improper use can degrade performance.
  • Consider reusability and performance, and evaluate on a case-by-case basis whether to use views or temporary tables.
  • It’s important to review the execution plan (EXPLAIN command) to verify query efficiency.

Next Steps

By using the WITH clause, you can create more efficient and maintainable SQL queries. Give it a try in your real projects. Let’s put it into practice with the following steps:
  1. Start with simple queries and practice structuring them using the WITH clause.
  2. Use recursive CTEs to tackle hierarchical data and complex scenarios.
  3. Focus on performance optimization to further enhance your SQL skills.
That concludes the content of this article. Please apply your knowledge of MySQL’s WITH clause to your daily work and learning.