Using MySQL COUNT(DISTINCT) to Get Unique Row Counts

1. Introduction

When operating a database, you may encounter situations such as “How many different countries are registered?” or “I want to know the number of unique email addresses.” In such cases, you can use MySQL’s COUNT(DISTINCT column_name) to obtain the data count while eliminating duplicates. In this article, we will explain the following topics in detail.
  • COUNT() and DISTINCT basics
  • Proper usage of COUNT(DISTINCT column_name)
  • How to count unique data across multiple columns
  • Ways to improve the performance of COUNT(DISTINCT)
To make it easy for beginners to understand, we will explain with concrete examples and SQL queries, so please read through to the end.

2. Basics of Counting Data in MySQL (COUNT)

When analyzing information in a database, the most fundamental tool is the COUNT() function. First, let’s understand the basic behavior of COUNT().

2.1 Difference between COUNT(*) and COUNT(column_name)

The MySQL COUNT() function has the following two usages.
COUNT function | Description
COUNT(*) | All rows in the table are counted (including rows that contain NULL)
COUNT(column_name) | Non-NULL values of a specific column are counted

2.2 Basic Example of COUNT()

Here, we will explain using the following users table as an example.
id | name | email | country
1 | Taro | taro@example.com | Japan
2 | Hanako | hanako@example.com | Japan
3 | John | NULL | USA
4 | Tanaka | tanaka@example.com | Japan

① Get the total number of rows in the table

SELECT COUNT(*) FROM users;
→ Result: 4 (total number of rows)

② Get the count of a specific column excluding NULL

SELECT COUNT(email) FROM users;
→ Result: 3 (count of email excluding NULL)
💡 Key point:
  • COUNT(*) retrieves the total number of rows including NULL.
  • COUNT(email) counts excluding NULL.
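
To try the two queries above yourself, the sample table can be recreated with a short script. This is only a sketch: the column types and the use of a temporary table are assumptions, chosen so that nothing persists after the session ends.

```sql
-- Assumed schema for the article's sample table (types are illustrative).
-- Note: a temporary table named users hides a permanent table of the same
-- name for the rest of the session.
DROP TEMPORARY TABLE IF EXISTS users;
CREATE TEMPORARY TABLE users (
  id      INT PRIMARY KEY,
  name    VARCHAR(50),
  email   VARCHAR(100),
  country VARCHAR(50)
);

INSERT INTO users (id, name, email, country) VALUES
  (1, 'Taro',   'taro@example.com',   'Japan'),
  (2, 'Hanako', 'hanako@example.com', 'Japan'),
  (3, 'John',   NULL,                 'USA'),
  (4, 'Tanaka', 'tanaka@example.com', 'Japan');

-- Both counts in one pass over the table.
SELECT COUNT(*)     AS total_rows,       -- 4: every row, including John's
       COUNT(email) AS rows_with_email   -- 3: John's NULL email is skipped
FROM users;
```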

3. Retrieve Data Excluding Duplicates (DISTINCT)

When aggregating data, you often want to retrieve only unique values. In such cases, DISTINCT is useful.

3.1 Basics of DISTINCT

DISTINCT is used to eliminate duplicate data in the specified column and retrieve the result.

Basic syntax

SELECT DISTINCT column_name FROM table_name;

3.2 Example usage of DISTINCT

Running the following SQL query retrieves a list of unique country names registered by users.
SELECT DISTINCT country FROM users;
→ Result:
country
Japan
USA

3.3 Difference between DISTINCT and GROUP BY

Feature | DISTINCT | GROUP BY
Purpose | Retrieve unique values | Perform aggregation per group
Usage | SELECT DISTINCT column_name | SELECT column_name, COUNT(*) GROUP BY column_name
Example | Retrieve unique countries | Count users per country
💡 Point:
  • DISTINCT simply removes duplicate data.
  • GROUP BY groups data and is used together with aggregate functions.

4. How to use COUNT(DISTINCT column_name)

COUNT(DISTINCT column_name) can be used to retrieve the number of unique values.

4.1 Basics of COUNT(DISTINCT)

Basic syntax

SELECT COUNT(DISTINCT column_name) FROM table_name;

4.2 Example of COUNT(DISTINCT) usage

SELECT COUNT(DISTINCT country) FROM users;
→ Result: 2 ("Japan" and "USA")

4.3 Using COUNT(DISTINCT) with a condition

SELECT COUNT(DISTINCT email) FROM users WHERE country = 'Japan';
→ Result: 3 (the number of unique email addresses among users whose country is Japan)
💡 Key points:
  • COUNT(DISTINCT column_name) retrieves the count of unique data while excluding NULL values.
  • By adding a WHERE clause, you can restrict the count to rows that meet specific conditions.
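
A WHERE clause yields one count for one condition. When a count per group is needed, COUNT(DISTINCT) can be combined with GROUP BY or with a CASE expression. The sketch below assumes the sample users table from section 2.2, recreated here as a temporary table with illustrative column types.

```sql
-- Assumed copy of the sample table (temporary, illustrative types).
DROP TEMPORARY TABLE IF EXISTS users;
CREATE TEMPORARY TABLE users (
  id INT PRIMARY KEY, name VARCHAR(50), email VARCHAR(100), country VARCHAR(50)
);
INSERT INTO users (id, name, email, country) VALUES
  (1, 'Taro',   'taro@example.com',   'Japan'),
  (2, 'Hanako', 'hanako@example.com', 'Japan'),
  (3, 'John',   NULL,                 'USA'),
  (4, 'Tanaka', 'tanaka@example.com', 'Japan');

-- One row per country: total rows and unique non-NULL emails.
SELECT country,
       COUNT(*)              AS user_rows,
       COUNT(DISTINCT email) AS unique_emails
FROM users
GROUP BY country
ORDER BY country;
-- Japan | 3 | 3
-- USA   | 1 | 0   (John's email is NULL, so nothing is counted)

-- The same figures as columns of a single row, using CASE.
-- CASE without ELSE yields NULL, which COUNT(DISTINCT) ignores.
SELECT COUNT(DISTINCT CASE WHEN country = 'Japan' THEN email END) AS japan_emails,  -- 3
       COUNT(DISTINCT CASE WHEN country = 'USA'   THEN email END) AS usa_emails     -- 0
FROM users;
```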

5. Using COUNT(DISTINCT) with Multiple Columns

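MySQL accepts a list of expressions inside COUNT(DISTINCT ...), so COUNT(DISTINCT column1, column2) counts the distinct combinations of the two columns. This is a MySQL extension. Standard SQL does not provide it, and the portable workaround is to combine the columns into a single value with CONCAT().

5.1 Is COUNT(DISTINCT column1, column2) Supported?

In MySQL, yes: the documented syntax is COUNT(DISTINCT expr, [expr...]). Combinations in which any of the listed columns is NULL are not counted. Because this form is not part of standard SQL, queries that must also run on other database systems should not rely on it.

5.2 How to Count Unique Values Across Multiple Columns

When you need a portable query, a common approach is to concatenate the columns using CONCAT() and apply COUNT(DISTINCT) to the result.

Example: Getting Unique Counts for Country and City Combinations

The following query assumes that the users table also has a city column, which the sample table above does not include.
SELECT COUNT(DISTINCT CONCAT(country, '-', city)) FROM users;
💡 Key Points:
  • CONCAT(column1, '-', column2) combines multiple columns into a single value. It returns NULL if any argument is NULL, so those rows are not counted.
  • Choose a separator that cannot appear in the data. Otherwise two different combinations can produce the same string and be counted once.
  • In MySQL, COUNT(DISTINCT column1, column2) counts the combinations directly without building strings.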
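
MySQL's documented syntax is COUNT(DISTINCT expr, [expr...]), so the multi-column form can be compared side by side with the concatenation approach. The sketch below uses a hypothetical user_locations table, because the sample users table has no city column. The table name, columns, and rows are all assumptions made for illustration.

```sql
-- Hypothetical table for illustration only.
DROP TEMPORARY TABLE IF EXISTS user_locations;
CREATE TEMPORARY TABLE user_locations (
  country VARCHAR(50),
  city    VARCHAR(50)
);
INSERT INTO user_locations (country, city) VALUES
  ('Japan', 'Tokyo'),
  ('Japan', 'Tokyo'),
  ('Japan', 'Osaka'),
  ('USA',   'Boston'),
  ('USA',   NULL);

SELECT
  -- MySQL extension: distinct (country, city) pairs in which no part is NULL.
  COUNT(DISTINCT country, city)                 AS native_pairs,     -- 3
  -- CONCAT() returns NULL if any argument is NULL, so ('USA', NULL) is skipped.
  COUNT(DISTINCT CONCAT(country, '-', city))    AS concat_pairs,     -- 3
  -- CONCAT_WS() skips NULL arguments, so ('USA', NULL) becomes 'USA' and is counted.
  COUNT(DISTINCT CONCAT_WS('-', country, city)) AS concat_ws_pairs   -- 4
FROM user_locations;
```

Which result is right depends on whether a row with a missing city should count as a combination of its own. Any string-based variant also relies on the separator never appearing inside the data.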

6. COUNT(DISTINCT) Performance Tuning

COUNT(DISTINCT) can be expensive because MySQL has to remove duplicate values before it can count them. On large datasets, consider leveraging indexes or alternative query forms.

6.1 Reasons COUNT(DISTINCT) Becomes Slow

  • Without a usable index, MySQL often has to read every row to collect the values to deduplicate.
  • If indexes are not properly set, the processing speed slows down.
  • The more rows and distinct values there are, the more work the deduplication step has to do.

6.2 Index Optimization for Accelerating COUNT(DISTINCT)

When the data volume is large, you can improve search speed by adding an index to the target column.

How to Add an Index

ALTER TABLE users ADD INDEX (country);

Check the Query Execution Plan Using the Index

EXPLAIN SELECT COUNT(DISTINCT country) FROM users;
💡 Key Point:
  • Using EXPLAIN lets you see how MySQL processes the query.
  • Applying an index can avoid full table scans and potentially speed up searches.
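
As a rough sketch of the workflow, the index can be given an explicit name so that it is easy to spot in the EXPLAIN output. The index name idx_users_country and the temporary copy of the sample table are assumptions. On a four-row table the optimizer's choice says little about real workloads, so treat this only as a way to learn what to look for.

```sql
-- Assumed copy of the sample table (temporary, illustrative types).
DROP TEMPORARY TABLE IF EXISTS users;
CREATE TEMPORARY TABLE users (
  id INT PRIMARY KEY, name VARCHAR(50), email VARCHAR(100), country VARCHAR(50)
);
INSERT INTO users (id, name, email, country) VALUES
  (1, 'Taro',   'taro@example.com',   'Japan'),
  (2, 'Hanako', 'hanako@example.com', 'Japan'),
  (3, 'John',   NULL,                 'USA'),
  (4, 'Tanaka', 'tanaka@example.com', 'Japan');

-- Named index on the column being counted (the name is hypothetical).
CREATE INDEX idx_users_country ON users (country);

-- In the output, the key column shows which index (if any) was chosen,
-- and "Using index" in the Extra column means the query was answered
-- from the index without reading the table rows.
EXPLAIN SELECT COUNT(DISTINCT country) FROM users;
```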

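6.3 Alternative: GROUP BY + COUNT

Depending on the data and the available indexes, a GROUP BY query can be faster.

Example: Counting Rows per Unique Value with GROUP BY

SELECT country, COUNT(*) FROM users GROUP BY country;
This query returns one row per country together with the number of users in it, so the number of rows returned equals the number of unique countries.
💡 Key Point:
  • GROUP BY can perform better than COUNT(DISTINCT) in some cases; compare both with EXPLAIN on your own data.
  • It is most effective when you want to group and aggregate data at the same time.
  • Unlike COUNT(DISTINCT), GROUP BY keeps NULL as a group of its own.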
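
To get a single number out of the GROUP BY form, the grouped result can be wrapped in a derived table and counted. The sketch below assumes the sample users table from section 2.2, recreated as a temporary table. It also shows the one place where the two approaches disagree: NULL handling.

```sql
-- Assumed copy of the sample table (temporary, illustrative types).
DROP TEMPORARY TABLE IF EXISTS users;
CREATE TEMPORARY TABLE users (
  id INT PRIMARY KEY, name VARCHAR(50), email VARCHAR(100), country VARCHAR(50)
);
INSERT INTO users (id, name, email, country) VALUES
  (1, 'Taro',   'taro@example.com',   'Japan'),
  (2, 'Hanako', 'hanako@example.com', 'Japan'),
  (3, 'John',   NULL,                 'USA'),
  (4, 'Tanaka', 'tanaka@example.com', 'Japan');

-- Count the groups instead of using COUNT(DISTINCT country).
SELECT COUNT(*) AS unique_countries
FROM (SELECT country FROM users GROUP BY country) AS per_country;   -- 2

-- GROUP BY treats NULL as a group, so this counts one more than
-- COUNT(DISTINCT email) does.
SELECT COUNT(*) AS email_groups
FROM (SELECT email FROM users GROUP BY email) AS per_email;         -- 4

-- Filtering NULL first makes the result match COUNT(DISTINCT email).
SELECT COUNT(*) AS unique_emails
FROM (SELECT email FROM users
      WHERE email IS NOT NULL
      GROUP BY email) AS per_email;                                 -- 3
```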

7. Common Errors and Solutions for COUNT(DISTINCT)

When using COUNT(DISTINCT), several common errors can occur. Here we introduce typical errors and their solutions.

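7.1 Error 1: COUNT(DISTINCT column1, column2) is rejected or returns an unexpected count

Cause of the error

MySQL supports COUNT(DISTINCT column1, column2), but only as a MySQL extension. Database systems that follow standard SQL more strictly may reject it. In MySQL itself, any combination that contains a NULL is left out of the count, which can make the result smaller than expected.

Solution: Use CONCAT()

For a query that also has to run outside MySQL, concatenate the columns and apply COUNT(DISTINCT) to the result. The example assumes the users table has a city column.
SELECT COUNT(DISTINCT CONCAT(country, '-', city)) FROM users;
💡 Key point:
  • By using CONCAT(column1, '-', column2), you can build a single value from multiple columns.
  • CONCAT() returns NULL if any argument is NULL, so those combinations are still skipped. To count them, replace NULL with a placeholder using IFNULL() before concatenating.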

7.2 Error 2: Does not work as expected when NULL values are present

Cause of the error

  • COUNT(DISTINCT column_name) ignores NULL values, so columns containing NULL may not produce the expected results.

Solution: Use IFNULL()

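By replacing NULL with a default value (e.g., '' or 'unknown'), all NULL rows are counted together as one additional value.
SELECT COUNT(DISTINCT IFNULL(email, 'unknown')) FROM users;
💡 Key point:
  • Using IFNULL(column_name, 'default_value') lets you include NULL in the count as a single value.
  • Choose a default value that does not already occur in the column; otherwise the NULL rows merge with real data.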
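
The effect of the placeholder is easiest to see when the variants run side by side. The sketch below assumes the sample users table from section 2.2, recreated as a temporary table. The third column is an alternative that avoids choosing a placeholder string; it is offered as one possible approach, not as the standard one.

```sql
-- Assumed copy of the sample table (temporary, illustrative types).
DROP TEMPORARY TABLE IF EXISTS users;
CREATE TEMPORARY TABLE users (
  id INT PRIMARY KEY, name VARCHAR(50), email VARCHAR(100), country VARCHAR(50)
);
INSERT INTO users (id, name, email, country) VALUES
  (1, 'Taro',   'taro@example.com',   'Japan'),
  (2, 'Hanako', 'hanako@example.com', 'Japan'),
  (3, 'John',   NULL,                 'USA'),
  (4, 'Tanaka', 'tanaka@example.com', 'Japan');

SELECT
  -- NULL is ignored: three real addresses.
  COUNT(DISTINCT email)                      AS unique_emails,     -- 3
  -- NULL becomes 'unknown' and is counted as one extra value.
  COUNT(DISTINCT IFNULL(email, 'unknown'))   AS with_placeholder,  -- 4
  -- email IS NULL evaluates to 1 or 0, so MAX() is 1 when any NULL exists.
  -- On an empty table MAX() returns NULL and so does the sum.
  COUNT(DISTINCT email) + MAX(email IS NULL) AS with_null_flag     -- 4
FROM users;
```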

7.3 Error 3: COUNT(DISTINCT) is slow

Cause of the error

  • COUNT(DISTINCT) scans all data, so it may become slow on large data sets.

Solution: Leverage indexes

ALTER TABLE users ADD INDEX (country);
💡 Key point:
  • Setting an index can improve query performance.
  • It’s good to use EXPLAIN to check the query’s optimization status.
EXPLAIN SELECT COUNT(DISTINCT country) FROM users;
By applying these measures, you can use COUNT(DISTINCT) more reliably and avoid performance issues.

8. Frequently Asked Questions (FAQ)

Here we have compiled frequently asked questions about COUNT(DISTINCT).

8.1 What is the difference between COUNT(*) and COUNT(DISTINCT column_name)?

Key points

Function | Description
COUNT(*) | Count all records (including NULL)
COUNT(DISTINCT column_name) | Count unique values (excluding NULL)

Example

SELECT COUNT(*) FROM users;
SELECT COUNT(DISTINCT email) FROM users;
💡 Key point:
  • COUNT(*) counts all records.
  • COUNT(DISTINCT column_name) retrieves the number of unique values (excluding NULL).

8.2 What is the difference between DISTINCT and GROUP BY?

Feature | DISTINCT | GROUP BY
Purpose | Retrieve unique values | Perform aggregation per group
Usage | SELECT DISTINCT column_name | SELECT column_name, COUNT(*) GROUP BY column_name
Example | Get unique countries | Count users per country

Example

-- Use DISTINCT
SELECT DISTINCT country FROM users;

-- Use GROUP BY
SELECT country, COUNT(*) FROM users GROUP BY country;
💡 Key point:
  • DISTINCT simply removes duplicate data.
  • GROUP BY groups data and can be combined with aggregate functions.

8.3 Is COUNT(DISTINCT) slow?

Issue

  • COUNT(DISTINCT) scans all data, so with large data volumes it can become slow.

Solution: Use indexes

ALTER TABLE users ADD INDEX (country);

Alternative method: Use GROUP BY

SELECT country, COUNT(*) FROM users GROUP BY country;
💡 Key point:
  • Applying indexes can improve query speed.
  • GROUP BY can sometimes yield faster results than COUNT(DISTINCT).

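8.4 How to use COUNT(DISTINCT column1, column2)?

Issue

  • MySQL supports COUNT(DISTINCT column1, column2) as an extension, but standard SQL does not, and combinations that contain NULL are not counted.

Solution: Use CONCAT() when portability matters

The example assumes the users table has a city column.
SELECT COUNT(DISTINCT CONCAT(country, '-', city)) FROM users;
💡 Key point:
  • CONCAT(column1, '-', column2) builds a single value from multiple columns.
  • With COUNT(DISTINCT CONCAT(...)) you can count unique combinations, provided the separator never appears in the data.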
By referring to these questions, you can use COUNT(DISTINCT) more efficiently.

9. Summary

In this article, we explained in detail how to use MySQL’s COUNT(DISTINCT). Finally, let’s review the key points of this article.

9.1 What You Learned in This Article

How to Retrieve Data Counts in MySQL
  • COUNT(*) retrieves the total number of rows
  • COUNT(column_name) counts non-NULL values
  • COUNT(DISTINCT column_name) gets the count of unique values
Difference between DISTINCT and COUNT(DISTINCT)
  • DISTINCT returns distinct rows (removes duplicates)
  • COUNT(DISTINCT column_name) counts the number of unique values
How to Use COUNT(DISTINCT) with Multiple Columns
  • MySQL supports COUNT(DISTINCT column1, column2) as an extension; for queries that must stay portable, you can use CONCAT() instead
Performance Optimization Techniques
  • Apply indexes to improve query speed
  • GROUP BY + COUNT can be faster in some cases, so compare both forms with EXPLAIN

9.2 What You Can Do with This Knowledge

With this knowledge, you can perform data aggregations such as:
🔹 Aggregating unique user counts
🔹 Retrieving row counts based on specific conditions
🔹 Counting unique data across multiple columns
🔹 Optimizing queries on large datasets
In the future, when aggregating or optimizing data in MySQL, be sure to refer to this article!