Mysql DBA 高级运维学习之路-mysql数据库乱码问题
1.在mysql数据库中插入数据不乱码的方法
1.1 建立测试数据
(1)创建linzhongniao测试数据库并查看建表语句
mysql> create database linzhongniao;Query OK, 1 row affected (0.00 sec)mysql> show create database linzhongniao\G *************************** 1. row *************************** Database: linzhongniaoCreate Database: CREATE DATABASE `linzhongniao` /*!40100 DEFAULT CHARACTER SET latin1 */1 row in set (0.00 sec)
提示:如果在安装MySQL时未指定字符集或者指定的是latin字符集,则mysql默认字符集是latin1。
(2)在linzhongniao库下创建一个student表,并查看表结构和建表语句
mysql> use linzhongniaoDatabase changedmysql> show tables;+--------------------+| Tables_in_linzhongniao |+--------------------+| student|+--------------------+1 row in set (0.00 sec)mysql> create table student( id int(4) NOT NULL AUTO_INCREMENT, name char(20) NOT NULL, PRIMARY KEY(id) );mysql> desc student;+-------+----------+------+-----+---------+----------------+| Field | Type | Null | Key | Default | Extra |+-------+----------+------+-----+---------+----------------+| id| int(4) | NO | PRI | NULL| auto_increment || name | char(20) | NO | | NULL||+-------+----------+------+-----+---------+----------------+2 rows in set (0.00 sec)mysql> show create table student\G*************************** 1. row *************************** Table: studentCreate Table: CREATE TABLE `student` ( `id` int(4) NOT NULL AUTO_INCREMENT, `name` char(20) NOT NULL, PRIMARY KEY (`id`)) ENGINE=InnoDB AUTO_INCREMENT=5 DEFAULT CHARSET=latin11 row in set (0.00 sec)
提示:默认建表不指定字符集则继承库的字符集,即latin1。
(3)批量插入数据到student表
mysql> insert into student values(1,'zhangsan'),(2,'lisi'),(3,'xiaozhang'),(4,'xiaohong');Query OK, 4 rows affected (0.01 sec)Records: 4 Duplicates: 0 Warnings: 0mysql> select * from student;+----+-----------+| id | name |+----+-----------+| 1 | zhangsan || 2 | lisi || 3 | xiaozhang || 4 | xiaohong |+----+-----------+4 rows in set (0.00 sec)
提示:数字和英文的数据时正常的,不会有乱码问题。
(4)插入两条中文数据
mysql> insert into student values(5,'我是谁');Query OK, 1 row affected, 1 warning (0.00 sec)mysql> insert into student values(6,'你好啊');Query OK, 1 row affected (0.00 sec)mysql> select * from student;+----+-----------+| id | name |+----+-----------+| 1 | zhangsan || 2 | lisi || 3 | xiaozhang || 4 | xiaohong || 5 | ??? || 6 | ??? |+----+-----------+8 rows in set (0.00 sec)
出现问题:中文内容乱码
为什么插入中文数据会出现乱码问题呢?
通过上面的例子我们可以看出客户端字符集和库,表字符集不一样导致乱码问题,所以我们在插入数据的时候要先查看系统字符集和客户端,库表字符集是否一样,不一样将字符集修改一致再插入数据。已经插入的数据有乱码可以将数据导出备份添加修改字符集命令后再重新导入。
1.2 方法一执行set names命令
命令语法:set names 接指定字符集
(1)查看建表语句,注意默认的字符集是latin1
mysql> show create table student\G*************************** 1. row *************************** Table: studentCreate Table: CREATE TABLE `student` ( `id` int(4) NOT NULL AUTO_INCREMENT, `name` char(20) NOT NULL, PRIMARY KEY (`id`)) ENGINE=InnoDB AUTO_INCREMENT=7 DEFAULT CHARSET=latin11 row in set (0.00 sec)
(2)设置MySQL客户端的字符集和建表的字符集latin1一致
设置插入数据的字符集为latin
mysql> set names latin1;Query OK, 0 rows affected (0.00 sec)
(3)再插入一条中文数据
mysql> insert into student values(7,'林中鸟');Query OK, 1 row affected (0.00 sec)mysql> select * from student where id='7';+----+-----------+| id | name |+----+-----------+| 7 | 林中鸟 |+----+-----------+1 row in set (0.00 sec)
提示:不乱码了,但是以前的数据就没办法解决了。
上面的是MySQL命令行插入数据不乱码的方法,那么如果更新的数据多就需要执行sql文件更新数据了,所以保证执行sql文件也不乱码怎么办呢?
1.3 方法二执行sql文件
(1)将要更新的多个sql语句放在文本文件中如test.sql
需要用system命令,执行system命令可以不退出数据库对系统的文件进行引用和查看。当然也可以退出数据库这样会比较麻烦。
mysql> system cat test.sqlset names latin1;insert into student values(8,'不认识');mysql> system ls;beifen.sh test.sql
提示:必须要加入set names latin1,确保插入数据不乱码。
(2)在MySQL命令行中通过source调用test.sql文件插入数据
用source命令执行sql文件实现对数据库的操作,可以恢复数据库的数据当然也可以退出数据库用输入重定向执行sql文件对数据库的数据进行恢复。
mysql> source test.sqlQuery OK, 0 rows affected (0.01 sec)Query OK, 1 row affected (0.00 sec)mysql> select * from student where id='8';+----+-----------+| id | name |+----+-----------+| 8 | 不认识 |+----+-----------+1 row in set (0.00 sec)
小结:执行DQL,DML语句的时候要set names 保持库和表的字符集一致,还要调整客户端的字符集。
1.3 方法三添加字符集参数解决导入数据乱码问题
(1)把要更新的多个SQL语句放入文本中,这次不带set names latin1
#set names latin1;insert into student values(9,'小红');
(2)通过MySQL命令加上字符集参数指定latin1字符集导入test.sql
[root@localhost ~]# mysql -uroot -p123456 --default-character-set=latin1 linzhongniao < test.sql
(3)通过-e参数在mysql库外查看结果
[root@localhost ~]# mysql -uroot -p123456 -e "select * from linzhongniao.student where id='9'"+----+--------+| id | name |+----+--------+| 9 | 小红 |+----+--------+
1.4 MySQL插入中文不乱码5中方法小结
方法一:执行set names命令再插入数据
mysql> set names latin1;Query OK, 0 rows affected (0.00 sec)mysql> insert into student values(7,'林中鸟');Query OK, 1 row affected (0.00 sec)
提示:确保test.sql文件格式正确
方法二:在sql文件中指定set names latin1;然后登录mysql,通过如下命令执行。
mysql> system cat test.sqlset names latin1;insert into student values(8,'不认识');mysql> source test.sqlQuery OK, 0 rows affected (0.00 sec)
方法三:在sql文件中指定set names latin1 然后通过mysql导入
[root@localhost ~]# mysql -uroot -p123456 linzhongniao < test.sql[root@localhost ~]# mysql -uroot -p123456 -e "set names latin1;select * from linzhongniao.student"
提示:这里的linzhongniao是库名不是表名。
方法四:通过指定mysql命令的字符集参数来实现
#set names latin1;insert into student values(9,'李四');[root@localhost ~]# mysql -uroot -p123456 --default-character-set=latin1 linzhongniao < test.sql
方法五:在配置文件里设置客户端及服务端相关参数
(1)更改my.cnf客户端client模块的参数,可以实现set names latin1效果,并永久生效
[client]default-character-set=latin1
提示:不需要重启服务,退出重新登陆生效。
(2)在服务端mysqld模块里面再指定latin1字符集
[mysqld]default-character-set=latin1适合5.1 及以前呢版本default-character-server=latin1 适合5.5
2.mysql数据库字符集知识
2.1 MySQL数据库字符集介绍
简单的说是一套文字符号及其编码、比较规则的集合。MySQL数据库字符集包括字符集(CHARACTER)和校对规则(COLLATION)两个概念。其中,字符集是用来定义MySQL数据字符串的存储方式,而校对规则则是定义比较字符串的方式。前面建库的语句中CHARACTER SET latin1即为数据库字符集而COLLATE latin1_swedish_ci 为校对字符集,有关字符集详细内容参考mysql手册,第10张字符集章节。
2.2 MySQL数据库常见字符集介绍
使用MySQL时常用的字符集有下表四种
2.3 MySQL如何选择合适的字符集
(1)如果处理各种各样的文字,发布到不同国家和地区,应选Unicode字符集。对mysql来说就是UTF-8(每个汉字三个字节),如果应用需处理英文,有少量汉字使用UTF-8字符集更好。
(2)如果只需支持中文,并且数据量很大,性能要求也很高,可选GBK(定长,每个汉字占双字节,英文也占双字节),处理大量运算,比较顺序等定长字符集更快,性能高。
(3)处理移动互联网业务,可能需要使用utf8mb4字符集。
2.4 查看当前MySQL系统支持的字符集
最常用的有四种:
[root@localhost ~]# mysql -uroot -p123456 -e "SHOW CHARACTER SET;"|egrep "gbk|utf8|latin1"|awk ' {print $0}'latin1 cp1252 West European latin1_swedish_ci 1gbk GBK Simplified Chinese gbk_chinese_ci 2utf8 UTF-8 Unicode utf8_general_ci 3utf8mb4 UTF-8 Unicode utf8mb4_general_ci 4
查看mysql当前的字符集设置情况
mysql> show variables like 'character_set%';+--------------------------+----------------------------------+| Variable_name| Value|+--------------------------+----------------------------------+| character_set_client | utf8 || character_set_connection | utf8 || character_set_database | latin1 || character_set_filesystem | binary || character_set_results| utf8 || character_set_server | latin1 || character_set_system | utf8 || character_sets_dir | /usr/local/mysql/share/charsets/ |+--------------------------+----------------------------------+
提示:默认情况下character_set_client,character_set_connection,character_set_results三者的字符集和系统的字符集是一致的,是同时修改的。即为:
[root@localhost ~]# cat /etc/sysconfig/i18n LANG="zh_CN.UTF-8"[root@localhost ~]# echo $LANGzh_CN.UTF-8
3.mysql数据库默认设置的字符集是什么?
(1)先看一下mysql默认情况下设置的字符集
mysql> show variables like 'character_set%';+--------------------------+----------------------------------+| Variable_name| Value|+--------------------------+----------------------------------+| character_set_client | gb2312 || character_set_connection | gb2312 || character_set_database | latin1 || character_set_filesystem | binary || character_set_results| gb2312 || character_set_server | latin1 || character_set_system | utf8 || character_sets_dir | /usr/local/mysql/share/charsets/ |+--------------------------+----------------------------------+
(2)不同字符集参数的含义如下
| character_set_client | latin1 客户端字符集| character_set_connection | latin1 连接字符集| character_set_database | latin1数据库字符集,配置文件指定或建库建表指定| character_set_results| latin1 返回结果字符集| character_set_server | latin1服务器字符集,配置文件指定或建库建表指定
更改linux系统字符集变量后,查看MySQL中字符集的变化
[root@localhost ~]# echo $LANGzh_CN.UTF-8[root@localhost ~]# mysql -uroot -p123456 -e "show variables like 'character_set%';"+--------------------------+----------------------------------+| Variable_name| Value|+--------------------------+----------------------------------+| character_set_client | utf8 || character_set_connection | utf8 || character_set_database | latin1 || character_set_filesystem | binary || character_set_results| utf8 || character_set_server | latin1 || character_set_system | utf8 || character_sets_dir | /usr/local/mysql/share/charsets/ |+--------------------------+----------------------------------+
我们发现character_set_connection,character_set_client,character_set_server 三者的字符集和系统的一致也都改成utf8了。
4.执行set names latin1到底做了什么
无论linux系统的字符集是gb2312还是utf8默认情况下插入数据都是乱码的。
(1)此时查看数据就是乱码
mysql> use linzhongniaoDatabase changedmysql> select * from student-> ;+----+---------------------+| id | name|+----+---------------------+| 1 | zhangsan|| 2 | lisi|| 3 | wanger || 4 | xiaozhang || 5 | xiaowang|| 6 | ??? || 7 | å°çº¢ || 8 | ä¸è®¤è¯† || 9 | æŽå›› |+----+---------------------+9 rows in set (0.10 sec)
(2)执行完set对应的字符集操作,就解决乱码问题了
mysql> show create database linzhongniao\G*************************** 1. row *************************** Database: linzhongniaoCreate Database: CREATE DATABASE `linzhongniao` /*!40100 DEFAULT CHARACTER SET latin1 */1 row in set (0.00 sec)mysql> show create table student\G*************************** 1. row *************************** Table: studentCreate Table: CREATE TABLE `student` ( `id` int(4) NOT NULL AUTO_INCREMENT, `name` char(20) NOT NULL, PRIMARY KEY (`id`)) ENGINE=InnoDB AUTO_INCREMENT=10 DEFAULT CHARSET=latin11 row in set (0.00 sec)
我们看库和表的字符集都是latin1,所以执行set names latin1保证字符集一样就不会乱码了。
mysql> set names latin1;Query OK, 0 rows affected (0.00 sec)mysql> select * from student;+----+-----------+| id | name |+----+-----------+| 1 | zhangsan || 2 | lisi || 3 | wanger|| 4 | xiaozhang || 5 | xiaowang || 6 | ??? || 7 | 小红 || 8 | 不认识|| 9 | 李四 |+----+-----------+
(3)执行完set字符集操作的结果改变了如下字三个字符集character_set_client,character_set_connection,character_set_results的参数。
mysql> show variables like 'character_set%';+--------------------------+----------------------------------+| Variable_name| Value|+--------------------------+----------------------------------+| character_set_client | latin1 || character_set_connection | latin1 || character_set_database | latin1 || character_set_filesystem | binary || character_set_results| latin1 || character_set_server | latin1 || character_set_system | utf8 || character_sets_dir | /usr/local/mysql/share/charsets/ |+--------------------------+----------------------------------+
5.mysql命令参数-default-character-set=latin1在做什么?
(1)先查看一下mysql的字符集
[root@localhost ~]# mysql -uroot -p123456 -e "show variables like 'character_set%';"+--------------------------+----------------------------------+| Variable_name| Value|+--------------------------+----------------------------------+| character_set_client | utf8 || character_set_connection | utf8 || character_set_database | latin1 || character_set_filesystem | binary || character_set_results| utf8 || character_set_server | latin1 || character_set_system | utf8 || character_sets_dir | /usr/local/mysql/share/charsets/ |+--------------------------+----------------------------------+
(2)带-default-character-set=latin1 参数登录mysql
[root@localhost ~]# mysql -uroot -p123456 --default-character-set=latin1Welcome to the MySQL monitor. Commands end with ; or \g.Your MySQL connection id is 7Server version: 5.5.32 Source distributionCopyright (c) 2000, 2013, Oracle and/or its affiliates. All rights reserved.Oracle is a registered trademark of Oracle Corporation and/or itsaffiliates. Other names may be trademarks of their respectiveowners.Type 'help;' or '\h' for help. Type '\c' to clear the current input statement.mysql>
(3)现在再查看mysql的字符集
mysql> show variables like 'character_set%';+--------------------------+----------------------------------+| Variable_name| Value|+--------------------------+----------------------------------+| character_set_client | latin1 || character_set_connection | latin1 || character_set_database | latin1 || character_set_filesystem | binary || character_set_results| latin1 || character_set_server | latin1 || character_set_system | utf8 || character_sets_dir | /usr/local/mysql/share/charsets/ |+--------------------------+----------------------------------+
(4)带参数登录也是临时修改不带参数登录又变回去了
[root@localhost ~]# mysql -uroot -p123456 --default-character-set=latin1 -e "show variables like 'character_set%';"+--------------------------+----------------------------------+| Variable_name| Value|+--------------------------+----------------------------------+| character_set_client | latin1 || character_set_connection | latin1 || character_set_database | latin1 || character_set_filesystem | binary || character_set_results| latin1 || character_set_server | latin1 || character_set_system | utf8 || character_sets_dir | /usr/local/mysql/share/charsets/ |+--------------------------+----------------------------------+[root@localhost ~]# mysql -uroot -p123456 -e "show variables like 'character_set%';"+--------------------------+----------------------------------+| Variable_name| Value|+--------------------------+----------------------------------+| character_set_client | utf8 || character_set_connection | utf8 || character_set_database | latin1 || character_set_filesystem | binary || character_set_results| utf8 || character_set_server | latin1 || character_set_system | utf8 || character_sets_dir | /usr/local/mysql/share/charsets/ |+--------------------------+----------------------------------+
6.确保MySQL数据库插入数据不乱码解决方案
6.1 统一客户端字符集
(1)MySQL数据库的下面几个字符集(客户端和服务端)统一成一个字符集才能确保插入的中文数据库可以正常输出。当然,linux系统的字符集也要尽可能和数据库字符集统一。
(2)mysql数据库字符集的含义:
Variable_name | Value +--------------------------+--------------------------------+①character_set_client | latin1 客户端字符集②character_set_connection | latin1 连接字符集③character_set_database | latin1 数据库字符集④character_set_results | latin1 返回结果字符集⑤character_set_server | latin1 服务器字符集,配置文件制定或建库建表指定
其中,①②④三个参数默认情况采用linux系统字符集设置,人工登录数据库执行set names latin1以及mysql指定字符集登录操作,都是改变mysql客户端的client、connection、results3个参数的字符集都为latin1,从而解决插入乱码问题,这个操作可以在my.cnf配置文件里修改mysql客户端的字符集,配置方法如下:
[client]Default-character-set=latin1提示:不需要重启[root@localhost ~]# sed -n "18,22p" /etc/my.cnf [client]#password = your_passwordport = 3306socket = /usr/local/mysql/tmp/mysql.sockdefault-character-set = latin1[root@localhost ~]# mysql -uroot -p123456 -e "show variables like 'character_set%';"+--------------------------+----------------------------------+| Variable_name| Value|+--------------------------+----------------------------------+| character_set_client | latin1 || character_set_connection | latin1 || character_set_database | latin1 || character_set_filesystem | binary || character_set_results| latin1 || character_set_server | latin1 || character_set_system | utf8 || character_sets_dir | /usr/local/mysql/share/charsets/ |+--------------------------+----------------------------------+
(3)修改完客户端字符集不用set查询表数据就不会乱码了
[root@localhost ~]# mysql -uroot -p123456 -e "select * from linzhongniao.student;"+----+-----------+| id | name |+----+-----------+| 1 | zhangsan || 2 | lisi || 3 | wanger|| 4 | xiaozhang || 5 | xiaowang || 6 | ??? || 7 | 小红|| 8 | 不认识 || 9 | 李四|+----+-----------+
6.2 统一MySQL服务端字符集
(1) 按下面要求修改my.cnf参数
[mysqld]default-character-set = latin1 适合5.1及以前版本character-set-server = utf8 适合5.5版本
(2) 修改前查看当前字符集
[root@localhost ~]# mysql -uroot -p123456 -e "show variables like 'character_set%';"+--------------------------+----------------------------------+| Variable_name| Value|+--------------------------+----------------------------------+| character_set_client | utf8 || character_set_connection | utf8 || character_set_database | latin1 || character_set_filesystem | binary || character_set_results| utf8 || character_set_server | latin1 || character_set_system | utf8 || character_sets_dir | /usr/local/mysql/share/charsets/ |+--------------------------+----------------------------------+
(3) 查看修改的参数
[root@localhost ~]# sed -n "26,27p" /etc/my.cnf [mysqld]character-set-server = utf8
(4) 重启mysql服务(生产环境是不允许重启的)
[root@localhost ~]# /etc/init.d/mysqld restartShutting down MySQL.. SUCCESS! Starting MySQL.. SUCCESS!
(5) 查看更改后的字符集
[root@localhost ~]# mysql -uroot -p123456 -e "show variables like 'character_set%';"+--------------------------+----------------------------------+| Variable_name| Value|+--------------------------+----------------------------------+| character_set_client | utf8 || character_set_connection | utf8 || character_set_database | utf8 || character_set_filesystem | binary || character_set_results| utf8 || character_set_server | utf8 || character_set_system | utf8 || character_sets_dir | /usr/local/mysql/share/charsets/ |+--------------------------+----------------------------------+
提示:以上在[mysqld]下设置的参数会更改下面2个参数的字符集设置。
| Variable_name | Value|| character_set_database | utf8 || character_set_server | utf8 |
这个时候我们再修改系统字符集mysql数据库字符集就不乱码了
[root@localhost ~]# cat /etc/sysconfig/i18n LANG="zh_CN.GB2312"#LANG="zh_CN.UTF-8"[root@localhost ~]# source /etc/sysconfig/i18n [root@localhost ~]# mysql -uroot -p123456 -e "show variables like 'character_set%';"+--------------------------+----------------------------------+| Variable_name| Value|+--------------------------+----------------------------------+| character_set_client | utf8 || character_set_connection | utf8 || character_set_database | utf8 || character_set_filesystem | binary || character_set_results| utf8 || character_set_server | utf8 || character_set_system | utf8 || character_sets_dir | /usr/local/mysql/share/charsets/ |+--------------------------+----------------------------------+
6.3 统一字符集方法总结
保证数据库数据不乱码的方法:建议中英文环境选择utf8 ,linux系统,客户端,服务端,库,表,程序字符集统一。
(1)Linux系统字符集统一utf8
[root@localhost ~]# cat /etc/sysconfig/i18n LANG="zh_CN.UTF-8"
提示linux客户端也要更改字符集 例如:xshell
例如:SecureCRT
(2)Mysql数据库客户端
临时:
set names latin1
永久:
更改my.cnf客户端模块的参数,可以实现set names latin1效果,并永久生效。
[client]
Default-character-set=latin1
(3)服务端
更改my.cnf参数
[mysqld]Default-character-set = latin1 适合5.1及以前版本character-set-server = latin1 适合5.5
(4)库表,程序指定字符集建库
create database linzhongniao_utf8 DEFAULT CHARACTER SET UTF8 COLLATE 后面加校对规则
我们可以show一下查看支持的校对规则
[root@localhost ~]# mysql -uroot -p123456 -e "SHOW CHARACTER SET;"|egrep "gbk|utf8|latin1"|awk ' {print $0}'latin1 cp1252 West European latin1_swedish_ci 1gbk GBK Simplified Chinese gbk_chinese_ci 2utf8 UTF-8 Unicode utf8_general_ci 3utf8mb4 UTF-8 Unicode utf8mb4_general_ci 4
7.如何更改生产MySQL数据库库表的字符集
对于已有数据库想修改字符集不能直接通过"alter database character set "或者"alter table tablename character set ",这两个命令都不能更新已有数据的字符集。而只是对新创建的表或者数据生效。
已经有记录的字符集的调整必须将数据导出,经过修改字符集之后重新导入才可完成。
修改数据库默认编码命令:
alter database [your db name] charset [your character setting]
下面模拟将latin1字符集的数据库修改成GBK字符集的过程。
(1)导出表结构
使用mysqldump的-d参数导出表结构
mysqldump -uroot -p123456 --default-character-set=latin1 -d dbname>alltable.sql --default-character-set=gbk 表示以GBK字符集进行连接 -d只导表结构。
(2)然后编辑alltable.sql将latin1用sed替换成GBK
(3)确保数据不在更新导出所有数据
mysqldmup -uroot -p123456 --quick --no-create-info --extended-insert --default-character-set=latin1 dbname>alltables.sql
参数说明:
--quick:用于转储大的表,强制mysqldump从服务器一次一行的检索数据而不是检索所有行并输出前CACHE到内存中。 --no-create-info:不创建CREATE TABLE 语句。 --extended-insert:使用包括几个VALUES列表的多行INSERT语法,这样文件更小节省IO导入数据非常快。 --default-character-set=latin1按照原有字符集导出数据,这样导出的文件中,所有中文都是可见的,不会保存成乱码。
(4)打开alltable.sql将set names latin1修改成set names gbk(或者修改my.cnf配置文件)
(5)建库
create database dbname default charset gbk;
(6)创建表执行,alltable.sql
mysql -uroot -p123456 dbname
(7)导入数据
mysql -uroot -p123456 dbname