Percona Server 5.5 資料庫utf8mb4字集轉換

MySQL Server 8.0預設字集為utf8mb4, 即4 bytes UTF-8, 支援了行動世代的繪文字(Emoji)與罕見字, 本文將實際試驗將資料庫utf8 (3 bytes UTF-8)轉為utf8mb4, 以Percona Server 5.5為例。建議您在測試環境轉換, 確認資料庫與應用程式正常運作再到生產環境轉換。

首先介紹測試用的環境與工具程式
1. 作業系統Debian Linux 9.5
2. MySQL資料庫: Percona Server 5.5.61
3. Percona ToolKit

Percona ToolKit的安裝方法
apt-get install percona-toolkit


這裡我以名稱test-db的資料庫做為要轉換utf8mb4範例, 請替換此名稱即可
MySQL的my.cnf設定檔確認有以下設定, 若無則加入並重新啟動, 支援超過key長度767 Bytes限制
innodb_file_format = Barracuda
innodb_file_format_max = Barracuda
innodb_file_per_table
innodb_large_prefix = 1


重新啟動MySQL
service mysql restart


使用Percona ToolKit轉換InnoDB表單為utf8mb4字集, 好處是可以整批處理, 或是把pt-find放置背景執行, 不用死盯著螢幕
pt-find --engine InnoDB --exec "ALTER TABLE %D.%N ROW_FORMAT=DYNAMIC, CONVERT TO CHARACTER SET utf8mb4 COLLATE utf8mb4_unicode_ci" --print --password mypassword --user root test-db

ROW_FORMAT=DYNAMIC可將InnoDB原本Redundant或Compact文件格式轉為Dynamic, 並列出被轉換的資料表

若有其他類型表單, 例如MEMORY
pt-find --engine MEMORY --exec "ALTER TABLE %D.%N CONVERT TO CHARACTER SET utf8mb4 COLLATE utf8mb4_unicode_ci" --print --password mypassword --user root test-db

同樣會列出被轉換的資料表

轉換完成時間看資料庫大小, 因為做了ALTER TABLE會花上很多時間產生新的utf8mb4資料表
接著轉換資料庫, 進入MySQL console
mysql -u root -p


使用test-db資料庫
use test-db;


轉換資料庫test-db的字集為utf8mb4
ALTER DATABASE test-db CHARACTER SET = utf8mb4 COLLATE = utf8mb4_unicode_ci;


確認是否轉為utf8mb4
SHOW CREATE DATABASE test-db;

會有像這樣的訊息
+----------+------------------------------------------------------------------------------------------------+
| Database | Create Database |
+------------------+----------------------------------------------------------------------------------------+
| test-db | CREATE DATABASE `test-db` /*!40100 DEFAULT CHARACTER SET utf8mb4 COLLATE utf8mb4_unicode_ci */ |
+------------------+----------------------------------------------------------------------------------------+
1 row in set (0.00 sec)


接著設定my.cnf, 將原本用utf8改為utf8mb4
character_set_client = utf8mb4
character_set_server = utf8mb4
collation_server = utf8mb4_unicode_ci
init_connect = 'SET collation_connection = utf8mb4_unicode_ci,NAMES utf8mb4'


重新啟動MySQL
service mysql restart


回到MySQL console, 檢視預設字集
use information_schema;
SELECT * FROM GLOBAL_VARIABLES WHERE VARIABLE_NAME LIKE '%coll%' OR VARIABLE_NAME LIKE '%char%' OR VARIABLE_NAME='init_connect';

會有像這樣的訊息
+--------------------------+-------------------------------------------------------------+
| VARIABLE_NAME | VARIABLE_VALUE |
+--------------------------+-------------------------------------------------------------+
| CHARACTER_SET_CLIENT | utf8mb4 |
| CHARACTER_SETS_DIR | /usr/share/mysql/charsets/ |
| CHARACTER_SET_RESULTS | utf8mb4 |
| CHARACTER_SET_CONNECTION | utf8mb4 |
| CHARACTER_SET_DATABASE | utf8mb4 |
| COLLATION_SERVER | utf8mb4_unicode_ci |
| CHARACTER_SET_SERVER | utf8mb4 |
| CHARACTER_SET_SYSTEM | utf8 |
| INIT_CONNECT | SET collation_connection = utf8mb4_unicode_ci,NAMES utf8mb4 |
| CHARACTER_SET_FILESYSTEM | binary |
| COLLATION_DATABASE | utf8mb4_unicode_ci |
| COLLATION_CONNECTION | utf8mb4_unicode_ci |
+--------------------------+-------------------------------------------------------------+
12 rows in set (0.00 sec)


轉換utf8mb4有許多好處, 應用程式可以少處理4 bytes UTF-8, 改用MySQL 8.0也是推薦!
Percona也於今日(台灣時間11月1日)發佈了Percona Server for MySQL 8.0.12的Release Candidate版本
Percona Server for MySQL 8.0 Delivers Increased Reliability, Performance and Security

今日熱門文章 網友點擊推薦!

文章分享
評分
複製連結