Table of Contents
- 1. Introduction
- 2. Main Causes of Japanese Text Garbling
- 2.1 Why Doesn’t Japanese Display Correctly in MySQL?
- 2.2 Cause 1: Default Character Encoding Remains latin1
- 2.3 Cause 2: Mismatch in Character Encoding Between Client and Server
- 2.4 Cause 3: Inconsistent Settings for Database, Tables, and Columns
- 2.5 Summary: Most Causes Are Due to “Character Encoding Mismatches”
- 3. How to Check MySQL’s Character Encoding Settings
- 4. How to Configure Settings to Handle Japanese Properly
- 4.1 Say Goodbye to Garbled Text with Proper Settings
- 5. Handling Japanese in Docker Environments
- 6. Common Issues and Their Solutions
- 6.1 Still Getting Garbled Text After Setup…? There Might Be Remaining Causes
- 6.1.1 Trouble 1: Settings Changes Not Reflected
- 6.1.2 Trouble 2: Japanese Characters Garbled in Terminal or Command Line
- 6.1.3 Trouble 3: Existing Database or Tables Created with latin1
- 6.1.4 Trouble 4: Character Encoding Mismatch on the Application Side (PHP, Python, etc.)
- 6.1.5 Trouble 5: Garbled Characters When Integrating with CSV or Excel
- 6.2 Comprehensive Checklist for Resolving Issues
- 7. Summary
- 8. Frequently Asked Questions (FAQ)
- 8.1 Resolving Common Questions About MySQL and Japanese Characters
- 8.1.1 Q1. Japanese characters are displayed as “???” in MySQL. What is the cause?
- 8.1.2 Q2. Even after setting utf8mb4 in my.cnf, it is not reflected.
- 8.1.3 Q3. Japanese characters are garbled in an existing table. Can it be fixed?
- 8.1.4 Q4. I’m using MySQL in Docker, but Japanese input causes garbled characters.
- 8.1.5 Q5. What is the difference between utf8 and utf8mb4? Which one should I use?
- 8.1.6 Q6. CSV files exported from Excel are garbled. What should I do?
1. Introduction
Can’t Handle Japanese Well in MySQL? A Thorough Explanation of the Causes and Solutions
Have you ever seen garbled characters or “???” when storing Japanese in MySQL, the database behind many web applications and WordPress sites? The problem is especially common for beginners, in local development environments such as XAMPP or MAMP, and in container environments like Docker. In most cases the cause is an inappropriate character encoding setting in MySQL.
This article explains how to configure MySQL to handle Japanese correctly and how to resolve the most common problems. It also covers practical topics such as Docker settings, my.cnf configuration, and fixing existing databases. The next section looks at the root cause: why does Japanese text become garbled?
2. Main Causes of Japanese Text Garbling
Why Doesn’t Japanese Display Correctly in MySQL?
If Japanese text in MySQL is displayed as “???” or as meaningless symbols, the cause is almost always a character encoding misconfiguration. MySQL is flexible, but if the character set and collation settings do not match across components, it cannot store or retrieve data correctly. The three most common causes are summarized below.
Cause 1: Default Character Encoding Remains latin1
In MySQL 5.7 and earlier, or in environments that inherit older defaults, the character set may be latin1 (intended for Western European languages). latin1 cannot represent Japanese, so the characters are already corrupted when they are inserted; what is stored in the database is garbled from the start.
Cause 2: Mismatch in Character Encoding Between Client and Server
In MySQL, character encoding comes into play at the following three stages:
- When the client sends data (character_set_client)
- When the server processes it (character_set_server)
- When results are returned (character_set_results)
For example, even if the client sends data in utf8mb4, if the server side processes it as latin1, the characters are corrupted along the way. This mismatch is the most common pitfall.
Cause 3: Inconsistent Settings for Database, Tables, and Columns
When you create a table without explicitly specifying the character encoding, MySQL’s defaults are applied as is. As a result, settings can end up mixed, for example:
- the database is utf8mb4,
- the table is utf8,
- the column is latin1.
Such inconsistencies between levels lead to garbled text.
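The two typical symptoms described above can be reproduced without a database. The following is a minimal sketch in plain Python (not MySQL itself), assuming the sample string 日本語: it shows what happens when UTF-8 bytes are read as latin1, and what happens when Japanese is forced into latin1.

```python
# Illustrative only: reproduces two kinds of damage in plain Python,
# without connecting to MySQL. The sample text is an arbitrary choice.
text = "日本語"

# Case 1: UTF-8 bytes are misread as latin1 (client/server mismatch).
# Each Japanese character becomes three unrelated latin1 characters.
utf8_bytes = text.encode("utf-8")
mojibake = utf8_bytes.decode("latin1")

# Case 2: the text is forced into latin1, which has no Japanese characters.
# Each unsupported character is replaced with "?".
replaced = text.encode("latin1", errors="replace").decode("latin1")

print(repr(mojibake))  # nine garbled characters instead of three
print(replaced)        # prints ???
```

In the first case the original bytes still exist, so the text is garbled but not lost. In the second case the information is gone, which is why “???” data usually cannot be recovered.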
Summary: Most Causes Are Due to “Character Encoding Mismatches”
Most Japanese text garbling in MySQL comes down to character encoding settings that do not match. The next section explains how to check MySQL’s current character encoding settings so you can identify the cause and fix it quickly.
3. How to Check MySQL’s Character Encoding Settings
To Pinpoint the Cause of Issues, “Checking Current Settings” Is the First Step
When Japanese is not handled correctly in MySQL, first check the current character set and collation settings. Several character encodings are involved in the exchange between client and server, and they need to match. This section shows how to check them from the command line or with SQL queries.
Checking Character Encoding with the SHOW VARIABLES Command
While connected to MySQL, you can check the current character encoding settings by running the following SQL:
SHOW VARIABLES LIKE 'character_set%';
The output looks like the following:
+--------------------------+---------+
| Variable_name            | Value   |
+--------------------------+---------+
| character_set_client     | utf8mb4 |
| character_set_connection | utf8mb4 |
| character_set_database   | utf8mb4 |
| character_set_results    | utf8mb4 |
| character_set_server     | utf8mb4 |
| character_set_system     | utf8    |
+--------------------------+---------+
Meaning of Each Setting Item
Item Name | Meaning and Role |
---|---|
character_set_client | The encoding of strings sent from the client |
character_set_connection | The character encoding used during communication between client and server |
character_set_results | The character encoding when query results are returned to the client |
character_set_database | The default character encoding of the currently selected database |
character_set_server | The default character encoding when creating new databases or tables |
character_set_system | The character encoding used internally by the server (usually no need to change) |
What matters most is that character_set_client, character_set_connection, and character_set_results match. If these three differ, strings arrive garbled at the server or come back garbled in results.
Checkpoints to Prevent Garbled Text
- Check that all items are set to utf8mb4 (character_set_system is internal and can be left as is)
- If different character encodings are mixed, change the settings as described later
- Note that character encodings can also be specified separately for individual tables and columns
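The checkpoints above can be automated once the variable values are available. The following is a rough sketch, assuming the values have already been copied into a dictionary (the helper name find_mismatches and the sample values are hypothetical). It ignores character_set_system and character_set_filesystem, which are normally not utf8mb4.

```python
# Variables that are expected to differ and can be ignored.
# character_set_system is internal; character_set_filesystem is normally "binary".
IGNORED = {"character_set_system", "character_set_filesystem"}


def find_mismatches(variables, expected="utf8mb4"):
    """Return the character_set_* variables whose value is not `expected`."""
    return {
        name: value
        for name, value in variables.items()
        if name.startswith("character_set_")
        and name not in IGNORED
        and value != expected
    }


# Hypothetical values, as if copied from SHOW VARIABLES LIKE 'character_set%'.
sample = {
    "character_set_client": "utf8mb4",
    "character_set_connection": "utf8mb4",
    "character_set_database": "latin1",
    "character_set_results": "utf8mb4",
    "character_set_server": "latin1",
    "character_set_system": "utf8",
}

print(find_mismatches(sample))
```

With the sample above, the database and server defaults are reported, which points to the server-side configuration rather than the connection.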
Supplement: Also Check the Collation
Collation determines how strings are sorted and compared. You can check it with the following command:
SHOW VARIABLES LIKE 'collation%';
Collation is unlikely to be the direct cause of garbled text, but it affects sorting and search accuracy for Japanese, so it is worth confirming that a utf8mb4 collation such as utf8mb4_general_ci or utf8mb4_unicode_ci is in use. The next section explains how to change these settings so that MySQL handles Japanese correctly.
4. How to Configure Settings to Handle Japanese Properly
Say Goodbye to Garbled Text with Proper Settings
To handle Japanese correctly in MySQL, it is important to unify all character encoding settings. utf8mb4 is the recommended choice because it supports not only Japanese but also emojis and special symbols. This section explains the settings for the client side, the server side, and tables and columns.
4.1 Client-Side Settings: Specify Explicitly at Connection
Running the following command immediately after connecting to MySQL sets the character encoding used for communication to utf8mb4:
SET NAMES 'utf8mb4';
This sets the following three variables at once:
- character_set_client
- character_set_connection
- character_set_results
✅Note:
- When connecting from PHP, call mysqli_set_charset($conn, 'utf8mb4');
- When using the mysql command-line client, specifying --default-character-set=utf8mb4 is also effective.
4.2 Server-Side Settings: Persistent Configuration with my.cnf
Adding the following to the server’s configuration file (my.cnf, or my.ini on Windows) changes the default character encoding for the whole MySQL installation to utf8mb4:
[client]
default-character-set = utf8mb4
[mysql]
default-character-set = utf8mb4
[mysqld]
character-set-server = utf8mb4
collation-server = utf8mb4_general_ci
✅Caution:
- After changing the settings, MySQL must be restarted. Example on Linux: sudo systemctl restart mysql
- The file location varies by environment; on Linux, /etc/mysql/my.cnf or /etc/my.cnf is commonly used.
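Before restarting, it can be useful to confirm that the file really contains the intended options. The following is a rough sketch using Python’s standard configparser, assuming a simple my.cnf like the one above. Note that configparser is not MySQL’s own parser: files containing directives such as !includedir or repeated options would need extra handling, and the helper name read_charsets is hypothetical.

```python
import configparser

# Hypothetical my.cnf content for illustration; in practice you would read
# the real file, for example /etc/mysql/my.cnf.
MY_CNF = """
[client]
default-character-set = utf8mb4

[mysql]
default-character-set = utf8mb4

[mysqld]
character-set-server = utf8mb4
collation-server = utf8mb4_general_ci
"""


def read_charsets(cnf_text):
    """Return the character set options found in each section."""
    # allow_no_value: my.cnf permits bare options without "= value".
    # interpolation=None: keep any "%" in values literal.
    parser = configparser.ConfigParser(allow_no_value=True, interpolation=None)
    parser.read_string(cnf_text)
    keys = ("default-character-set", "character-set-server", "collation-server")
    return {
        section: {k: parser[section][k] for k in keys if k in parser[section]}
        for section in parser.sections()
    }


settings = read_charsets(MY_CNF)
print(settings["mysqld"])
```

If the [mysqld] entry is missing or shows a different value, the server will keep its previous default even after a restart.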
4.3 Specifying Character Encoding for Databases and Tables
When creating a new database or table, always specify the character encoding explicitly.
Example of Database Creation:
CREATE DATABASE mydb CHARACTER SET utf8mb4 COLLATE utf8mb4_general_ci;
Example of Table Creation:
CREATE TABLE users (
id INT AUTO_INCREMENT PRIMARY KEY,
name VARCHAR(100)
) CHARACTER SET utf8mb4 COLLATE utf8mb4_general_ci;
Changing an Existing Table:
ALTER TABLE users CONVERT TO CHARACTER SET utf8mb4 COLLATE utf8mb4_general_ci;
4.4 Recommended Character Encoding: Why utf8mb4?
MySQL also has a character set named utf8 (an alias for utf8mb3), but it only supports UTF-8 sequences of up to 3 bytes per character. As a result, emojis and some kanji (such as certain variant forms) cannot be stored. utf8mb4, by contrast, supports up to 4 bytes per character and covers the full UTF-8 range, so it is now the standard choice. The next section covers the settings and precautions for Japanese when running MySQL in Docker.
5. Handling Japanese in Docker Environments
To Handle Japanese Correctly Even in Container Environments
Docker is increasingly used for development environments, and a frequent complaint is that Japanese characters are garbled in MySQL on Docker. The usual causes are the container’s locale settings or MySQL’s initial settings. This section introduces concrete countermeasures.
5.1 Setting the Locale (Language Environment) to Japanese-Compatible in Dockerfile
Locale settings matter not only for the MySQL container but also for the application server that handles Japanese. The following is an example for a Debian-based Dockerfile:
RUN apt-get update && apt-get install -y locales && locale-gen ja_JP.UTF-8 && update-locale LANG=ja_JP.UTF-8
ENV LANG=ja_JP.UTF-8
ENV LC_ALL=ja_JP.UTF-8
✅Points:
- Prevents encoding errors when reading and writing Japanese files on the application side.
- Affects not only MySQL but also execution environments like PHP and Python.
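To see whether the locale settings actually reached the application runtime, you can inspect the encodings the runtime picked up. The following is a minimal sketch for a Python application container (the helper name is_utf8 is hypothetical). Recent Python versions may switch to UTF-8 on their own in some locales, so treat the result as a hint rather than proof.

```python
import codecs
import locale
import sys


def is_utf8(encoding_name):
    """Return True if the given encoding name is an alias of UTF-8."""
    try:
        return codecs.lookup(encoding_name).name == "utf-8"
    except LookupError:
        return False


# Encodings Python derived from the environment (LANG / LC_ALL).
report = {
    "preferred": locale.getpreferredencoding(False),
    "stdout": getattr(sys.stdout, "encoding", None) or "",
    "filesystem": sys.getfilesystemencoding(),
}

for source, name in report.items():
    status = "OK" if is_utf8(name) else "NOT UTF-8"
    print(f"{source}: {name} ({status})")
```

Running this inside the container (for example via docker exec) before and after adding the ENV lines shows whether the change took effect.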
5.2 How to Specify Character Encoding for MySQL in docker-compose
When starting a MySQL container with docker-compose.yml, you can specify the character encoding through startup options in command:, as follows.
services:
  db:
    image: mysql:8.0
    container_name: mysql-ja
    environment:
      MYSQL_ROOT_PASSWORD: rootpass
      MYSQL_DATABASE: mydb
      MYSQL_USER: user
      MYSQL_PASSWORD: password
      TZ: Asia/Tokyo
      LANG: ja_JP.UTF-8
      LC_ALL: ja_JP.UTF-8
    command: --character-set-server=utf8mb4 --collation-server=utf8mb4_general_ci
    ports:
      - "3306:3306"
    volumes:
      - ./mysql-data:/var/lib/mysql
✅Note:
- You can set MySQL startup parameters in the command: section.
- TZ and LANG are also useful for setting up a Japanese environment.
5.3 Verifying Japanese Operation Inside the MySQL Container
To verify that MySQL is set to utf8mb4, connect to MySQL inside the container:
docker exec -it mysql-ja mysql -u root -p
After logging in, check the settings with the following command:
SHOW VARIABLES LIKE 'character_set%';
If everything is set to utf8mb4, problems with saving and displaying Japanese are unlikely.
Summary: In Docker Environments, “Startup Settings” and “Locale” Are Key
To handle Japanese safely with MySQL in Docker:
- Explicitly specify utf8mb4 when starting the MySQL container
- Set the locale of the application-side container to ja_JP.UTF-8
6. Common Issues and Their Solutions
Still Getting Garbled Text After Setup…? There Might Be Remaining Causes
Even after changing the MySQL settings to utf8mb4, it is not uncommon for Japanese to still display incorrectly or fail to save. This section covers commonly reported issues and their solutions.
Trouble 1: Settings Changes Not Reflected
Cause: After editing a configuration file (my.cnf or docker-compose.yml), the changes often do not take effect because MySQL has not been restarted.
Solution:
- In a server environment, restart with sudo systemctl restart mysql
- For Docker, run docker-compose down followed by docker-compose up -d
Trouble 2: Japanese Characters Garbled in Terminal or Command Line
Cause: Here the garbling comes not from MySQL itself but from the terminal’s display encoding. For example, UTF-8 output may not display correctly in the Windows Command Prompt.
Solution:
- For Windows: switch to UTF-8 with the chcp 65001 command
- For macOS/Linux: set the terminal’s encoding to UTF-8 (most terminals use it by default)
Trouble 3: Existing Database or Tables Created with latin1
Cause: If an existing database or table that is already in use was created with latin1, the Japanese data in it may already be corrupted.
Solution:
- Check the table structure:
SHOW CREATE TABLE your_table_name;
- Convert the table:
ALTER TABLE your_table_name CONVERT TO CHARACTER SET utf8mb4 COLLATE utf8mb4_general_ci;
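Before converting, it may help to check on a few sample values whether the damage is of the reversible kind (UTF-8 bytes that were read as latin1) or not. The following is a minimal sketch in plain Python; the helper name try_repair and the samples are hypothetical. MySQL’s latin1 is closer to Windows cp1252 than to Python’s latin1 codec, so real data may need cp1252 here instead.

```python
def try_repair(garbled):
    """Try to undo 'UTF-8 bytes read as latin1' damage.

    Returns the repaired string, or None when the value does not look like
    that kind of damage (for example "???", or text that is already fine).
    """
    try:
        repaired = garbled.encode("latin1").decode("utf-8")
    except (UnicodeEncodeError, UnicodeDecodeError):
        return None
    # Plain ASCII survives the round trip unchanged: nothing was repaired.
    return repaired if repaired != garbled else None


# Hypothetical samples for illustration.
recoverable = "日本語".encode("utf-8").decode("latin1")

print(try_repair(recoverable))  # prints 日本語
print(try_repair("???"))        # prints None
```

Values that come back as None either were never damaged or have lost the original bytes; the latter can only be restored from a backup or corrected by hand.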
Note: Data that is already corrupted cannot be repaired by this operation. Take a backup or dump beforehand, and be prepared to fix some data manually.
Trouble 4: Character Encoding Mismatch on the Application Side (PHP, Python, etc.)
Cause: Even if the MySQL side supports utf8mb4, garbled characters occur if the application sends strings in a different encoding or does not declare utf8mb4 for its connection.
Solution:
- PHP: mysqli_set_charset($conn, "utf8mb4");
- Python (MySQL Connector): specify charset='utf8mb4' at connection time
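For the Python case, the following is a sketch of what the connection setup might look like, assuming the mysql-connector-python package and the placeholder credentials from the docker-compose example in this article. The helper names are hypothetical, and the driver is imported lazily so that the configuration part can be read and run on its own.

```python
def build_connection_config(host, user, password, database):
    """Build keyword arguments for mysql.connector.connect()."""
    return {
        "host": host,
        "user": user,
        "password": password,
        "database": database,
        # Declare utf8mb4 for the connection, similar in effect to SET NAMES.
        "charset": "utf8mb4",
        "collation": "utf8mb4_general_ci",
    }


def fetch_charset_variables(config):
    """Connect and return the session's character_set% variables as a dict."""
    # Imported here so this module loads even without the driver installed.
    import mysql.connector  # pip install mysql-connector-python

    conn = mysql.connector.connect(**config)
    try:
        cursor = conn.cursor()
        cursor.execute("SHOW VARIABLES LIKE 'character_set%'")
        return dict(cursor.fetchall())
    finally:
        conn.close()


# Placeholder values; replace them with your own environment's settings.
config = build_connection_config("127.0.0.1", "user", "password", "mydb")
print(config["charset"])  # prints utf8mb4
```

Calling fetch_charset_variables(config) against a running server returns the same variables shown by SHOW VARIABLES, which makes it easy to confirm that the application’s connection really uses utf8mb4.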
Trouble 5: Garbled Characters When Integrating with CSV or Excel
Cause: CSV files exchanged with Excel may be encoded in Shift-JIS or in UTF-8 with BOM, so take care that they line up with MySQL’s utf8mb4.
Solution:
- Convert the file to UTF-8 before importing the CSV
- During export, explicitly run SET NAMES 'utf8mb4';
- For files that will be opened in Excel, save them as “UTF-8 (with BOM)”
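The conversion step can be scripted. The following is a minimal sketch, assuming the legacy encoding is cp932 (the Windows variant of Shift_JIS that Japanese Excel typically produces); the function name and sample data are hypothetical. The detection is a simple heuristic, so check the result on real files before importing.

```python
import codecs


def to_utf8_without_bom(raw, fallback="cp932"):
    """Return CSV bytes re-encoded as UTF-8 without a BOM."""
    if raw.startswith(codecs.BOM_UTF8):
        # Already UTF-8; just drop the BOM.
        return raw[len(codecs.BOM_UTF8):]
    try:
        raw.decode("utf-8")
        return raw  # already valid UTF-8
    except UnicodeDecodeError:
        # Assumption: anything that is not valid UTF-8 is in `fallback`.
        return raw.decode(fallback).encode("utf-8")


# Hypothetical sample data standing in for files exported from Excel.
sjis_csv = "名前,年齢\n山田,30\n".encode("cp932")
bom_csv = codecs.BOM_UTF8 + "名前,年齢\n".encode("utf-8")

print(to_utf8_without_bom(sjis_csv).decode("utf-8"))
print(to_utf8_without_bom(bom_csv).decode("utf-8"))
```

For real files, read the bytes with open(path, "rb"), convert them, and write the result to a new file before importing it into MySQL.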
Comprehensive Checklist for Resolving Issues
Checklist Item | Status |
---|---|
All character_set_* are utf8mb4 | ✅ |
collation_server is utf8mb4_general_ci | ✅ |
Character encoding explicitly set for database, tables, and columns | ✅ |
Application’s sent character encoding is utf8mb4 | ✅ |
Encoding in usage environment (terminal, editor, etc.) is UTF-8 | ✅ |
7. Summary
Reviewing the Necessary Settings and Mindset for Handling Japanese in MySQL
To handle Japanese properly in MySQL, it is not enough to assume that “just setting it to utf8 for now will be fine.” What matters is consistent settings and an understanding of the overall flow.
Review of the Main Points Explained in This Article:
- The main causes of Japanese text garbling are inappropriate character encodings such as latin1 and inconsistent settings between client and server.
- MySQL’s character encoding settings can be checked with the SHOW VARIABLES command.
- The recommended character encoding is utf8mb4. It covers the full range of UTF-8 and supports emojis and variant kanji.
- Configure the settings at three levels: client, server, and database/table.
- In Docker environments, specifying command: and LANG is essential. Both the locale and the character encoding need to be set.
- When trouble occurs, isolate and address the causes step by step. Check not only MySQL itself but also the terminal, the application, and any exchange of external data.
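The difference between utf8 and utf8mb4 noted in the review above comes down to bytes per character, which is easy to check. The following is a small sketch in plain Python; the helper names and sample strings are arbitrary choices for illustration.

```python
def utf8_byte_length(ch):
    """Number of bytes one character needs in UTF-8."""
    return len(ch.encode("utf-8"))


def fits_in_utf8mb3(text):
    """True if every character needs at most 3 bytes in UTF-8,
    which is the limit of MySQL's utf8 (utf8mb3) character set."""
    return all(utf8_byte_length(ch) <= 3 for ch in text)


# Ordinary kanji, a 4-byte kanji variant, and an emoji.
samples = ["日本語", "𠮷野家", "寿司🍣"]
for s in samples:
    print(s, fits_in_utf8mb3(s))
```

Only the first sample fits within 3 bytes per character; the other two contain 4-byte characters and therefore need utf8mb4.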
Points to Keep in Mind for Future Operations
- When building a new MySQL environment, design for utf8mb4 from the start.
- When developing in teams or across multiple environments, document and share configuration files and connection parameters.
- In Docker or CI/CD environments, automate the settings (environment variables and configuration file management).
- When importing and exporting data, consider character encoding conversion tools such as iconv or nkf.
Finally
Once MySQL is set up to handle Japanese properly, subsequent operations and development go much more smoothly. If you understand why garbling occurs and where and how to configure the settings, you can prevent problems in advance and process data reliably. I hope this article helps make your development environment more comfortable.
8. Frequently Asked Questions (FAQ)
Resolving Common Questions About MySQL and Japanese Characters
Q1. Japanese characters are displayed as “???” in MySQL. What is the cause?
A. The main cause of “???” is a character encoding mismatch. For example, if the client sends Japanese in utf8mb4 and the server receives it as latin1, the characters are garbled.
Running SET NAMES 'utf8mb4'; at connection time resolves the issue in many cases.
Q2. Even after setting utf8mb4 in my.cnf, it is not reflected.
A. Simply editing my.cnf does not apply the changes; you need to restart the MySQL server. On Linux, use sudo systemctl restart mysql; in a Docker environment, run docker-compose down followed by docker-compose up -d.
Q3. Japanese characters are garbled in an existing table. Can it be fixed?
A. Complete repair can be difficult, but the following steps are worth trying.
- Check the table structure (SHOW CREATE TABLE)
- Convert the character encoding:
ALTER TABLE your_table_name CONVERT TO CHARACTER SET utf8mb4 COLLATE utf8mb4_general_ci;
However, if the stored data is already corrupted, you may need to restore from a backup or correct it manually.
Q4. I’m using MySQL in Docker, but Japanese input causes garbled characters.
A. In addition to the MySQL settings, add locale settings (such as LANG=ja_JP.UTF-8) in the Dockerfile or docker-compose.yml. Also specify --character-set-server=utf8mb4 explicitly in the startup command for the MySQL container.
Q5. What is the difference between utf8 and utf8mb4? Which one should I use?
A. MySQL’s utf8 only handles UTF-8 characters of up to 3 bytes. utf8mb4 supports up to 4 bytes and can properly handle emojis and some kanji that utf8 cannot store.
utf8mb4 is the recommended choice today for both compatibility and future-proofing.
Q6. CSV files exported from Excel are garbled. What should I do?
A. Depending on the settings, Excel saves CSV files as Shift_JIS or as UTF-8 with BOM, which can cause a mismatch with MySQL’s character encoding. Save the CSV file as UTF-8, or run SET NAMES 'utf8mb4'; when importing so that the MySQL side uses the same encoding.
If this FAQ does not resolve your issue, review the settings from the beginning, or consider rebuilding the development environment. Working through these issues patiently leads to correct handling of Japanese data.