MySQL High Availability: The Complete MHA Deployment Process
Deployment plan
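Host          IP            OS         Software
mysql_master  192.168.2.74  centos6.9  mysql5.5/mha-node
mysql_slave1  192.168.2.75  centos6.9  mysql5.5/mha-node
mysql_slave2  192.168.2.76  centos6.9  mysql5.5/mha-node/mha-man

This deployment uses 3 servers. mha-manager does not get a server of its own here, although in production it can be split out onto a dedicated host. All hosts run CentOS 6.9 (tuned as described in http://youprince.blog.51cto.com/9272426/1974967) and MySQL 5.5 (installed with ansible, which is not covered here). All 3 nodes take part in master election, and the VIP is 192.168.2.199.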
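The plan above implies one hosts-file entry per node. As a minimal sketch (the function name print_mha_hosts is made up for illustration, and it only prints to stdout rather than touching /etc/hosts), the entries could be generated like this:

```shell
#!/bin/sh
# Sketch: print the /etc/hosts entries implied by the deployment plan.
# Nothing is written to /etc/hosts here; redirect the output yourself.
print_mha_hosts() {
    cat <<'EOF'
192.168.2.74 mysql_master
192.168.2.75 mysql_slave1
192.168.2.76 mysql_slave2
EOF
}

print_mha_hosts
```

Running `print_mha_hosts >> /etc/hosts` as root on each of the 3 hosts would append the entries; check first that they are not already present.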
Prerequisites
a) Configure the hostnames
b) Set up the hosts file
#### Run on each of the 3 hosts
cat >> /etc/hosts <
c) Set up passwordless SSH login between the 3 hosts
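ssh-keygen -t rsa             # just press Enter at every prompt
ssh-copy-id -i mysql_master   # answer yes, then enter the password
ssh-copy-id -i mysql_slave1   # answer yes, then enter the password
ssh-copy-id -i mysql_slave2   # answer yes, then enter the password

Install MySQL on the 3 hosts and set up replication

Installing MySQL is not covered here. Now set up replication. On mysql_master, check the current binlog file and position: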
mysql> show master status;
+------------------+----------+--------------+------------------+
| File             | Position | Binlog_Do_DB | Binlog_Ignore_DB |
+------------------+----------+--------------+------------------+
| mysql-bin.000003 |      107 |              |                  |
+------------------+----------+--------------+------------------+
1 row in set (0.00 sec)

mysql> grant replication slave on *.* to 'repl'@'192.168.2.75' identified by 'admin123';
Query OK, 0 rows affected (0.00 sec)

mysql> grant replication slave on *.* to 'repl'@'192.168.2.76' identified by 'admin123';
Query OK, 0 rows affected (0.00 sec)

Configure replication on mysql_slave1 and mysql_slave2:
mysql> change master to master_host='192.168.2.74',master_port=55555,master_user='repl',master_password='admin123',master_log_file='mysql-bin.000003',master_log_pos=107;
Query OK, 0 rows affected (0.00 sec)

mysql> start slave;
mysql> show slave status\G

Install the semi-synchronous replication plugins on mysql_master:
mysql> install plugin rpl_semi_sync_master soname 'semisync_master.so';
Query OK, 0 rows affected (0.02 sec)

mysql> install plugin rpl_semi_sync_slave soname 'semisync_slave.so';
Query OK, 0 rows affected (0.01 sec)

mysql> set global rpl_semi_sync_master_enabled=on;
Query OK, 0 rows affected (0.00 sec)

Install the plugins on mysql_slave1 and mysql_slave2:
mysql> install plugin rpl_semi_sync_master soname 'semisync_master.so';
Query OK, 0 rows affected (0.01 sec)

mysql> install plugin rpl_semi_sync_slave soname 'semisync_slave.so';
Query OK, 0 rows affected (0.00 sec)

mysql> set global rpl_semi_sync_slave_enabled=on;
Query OK, 0 rows affected (0.00 sec)

mysql> set global relay_log_purge=0;
Query OK, 0 rows affected (0.00 sec)

4. Install mha-node
mha-node must be installed on all 3 MySQL hosts. Install the dependency first:
yum install perl-DBD-MySQL
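[root@mysql_master ~]# rpm -ivh mha4mysql-node-0.54-0.el6.noarch.rpm
Preparing...                ########################################### [100%]
   1:mha4mysql-node         ########################################### [100%]
[root@mysql_master ~]#

Create the accounts on every MySQL instance that takes part in master election (all 3 of mine do, so the statements below are run on all 3; I have no need for read-only replicas). The mha account is the one mha-manager uses to manage the databases; its privileges are broad, so take care with it in production. The repl account is used for replication.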
grant all privileges on *.* to 'mha'@'192.168.2.74' identified by 'admin123';
grant all privileges on *.* to 'mha'@'192.168.2.75' identified by 'admin123';
grant all privileges on *.* to 'mha'@'192.168.2.76' identified by 'admin123';
grant replication slave on *.* to 'repl'@'192.168.2.74' identified by 'admin123';
grant replication slave on *.* to 'repl'@'192.168.2.75' identified by 'admin123';
grant replication slave on *.* to 'repl'@'192.168.2.76' identified by 'admin123';
flush privileges;

The output looks like this (only one of the hosts is shown):
mysql> grant all privileges on *.* to 'mha'@'192.168.2.74' identified by 'admin123';
Query OK, 0 rows affected (0.00 sec)

mysql> grant all privileges on *.* to 'mha'@'192.168.2.75' identified by 'admin123';
Query OK, 0 rows affected (0.00 sec)

mysql> grant all privileges on *.* to 'mha'@'192.168.2.76' identified by 'admin123';
Query OK, 0 rows affected (0.00 sec)

mysql> grant replication slave on *.* to 'repl'@'192.168.2.74' identified by 'admin123';
Query OK, 0 rows affected (0.00 sec)

mysql> grant replication slave on *.* to 'repl'@'192.168.2.75' identified by 'admin123';
Query OK, 0 rows affected (0.00 sec)

mysql> grant replication slave on *.* to 'repl'@'192.168.2.76' identified by 'admin123';
Query OK, 0 rows affected (0.00 sec)

mysql> flush privileges;
Query OK, 0 rows affected (0.00 sec)

5. Install the mha-manager node (it can go on a dedicated host; here I use one of the existing nodes, mysql_slave2)
Install the dependencies
yum install -y perl-DBD-MySQL
yum install -y perl-Config-Tiny
yum install -y perl-Log-Dispatch
yum install -y perl-Parallel-ForkManager
yum install -y perl-Time-HiRes
yum install -y perl-devel

Install mha-manager
[root@mysql_slave2 ~]# rpm -ivh mha4mysql-manager-0.55-0.el6.noarch.rpm
Preparing...                ########################################### [100%]
   1:mha4mysql-manager      ########################################### [100%]

Create the configuration file
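For orientation, here is a sketch of what an app1.cnf for this topology might contain. It is not the author's file: the values are inferred from the check output elsewhere in this post (port 55555, the binlog directory, the manager log path, candidate_master on the slaves) and from the accounts created earlier, and manager_workdir, remote_workdir and the candidate_master setting on mysql_master are assumptions. The helper name write_app1_cnf is hypothetical.

```shell
#!/bin/sh
# Sketch only: writes an illustrative MHA application config to the path in $1.
# Every value is inferred or assumed, not copied from the original file.
write_app1_cnf() {
    cat > "$1" <<'EOF'
[server default]
manager_workdir=/etc/mha/work
manager_log=/etc/mha/logs/manager.log
remote_workdir=/var/tmp
master_binlog_dir=/data/mysql/55555/logs/
user=mha
password=admin123
repl_user=repl
repl_password=admin123
ssh_user=root
port=55555

[server1]
hostname=mysql_master
candidate_master=1

[server2]
hostname=mysql_slave1
candidate_master=1

[server3]
hostname=mysql_slave2
candidate_master=1
EOF
}
```

Calling `write_app1_cnf /tmp/app1.cnf.example` writes the sketch to a scratch path so it can be compared with a real configuration before anything under /etc/mha is touched.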
mkdir -p /etc/mha/{conf,logs,work}   # directories used by mha-manager; conf holds the configuration files
##### Create the app1.cnf file
cat > /etc/mha/conf/app1.cnf <
Check the SSH connections
[root@mysql_slave2 conf]# masterha_check_ssh --conf=/etc/mha/conf/app1.cnf
Sat Nov 4 00:49:15 2017 - [warning] Global configuration file /etc/masterha_default.cnf not found. Skipping.
Sat Nov 4 00:49:15 2017 - [info] Reading application default configurations from /etc/mha/conf/app1.cnf..
Sat Nov 4 00:49:15 2017 - [info] Reading server configurations from /etc/mha/conf/app1.cnf..
Sat Nov 4 00:49:15 2017 - [info] Starting SSH connection tests..
Sat Nov 4 00:49:16 2017 - [debug]
Sat Nov 4 00:49:15 2017 - [debug] Connecting via SSH from root@mysql_master(192.168.2.74:22) to root@mysql_slave1(192.168.2.75:22)..
Sat Nov 4 00:49:15 2017 - [debug] ok.
Sat Nov 4 00:49:15 2017 - [debug] Connecting via SSH from root@mysql_master(192.168.2.74:22) to root@mysql_slave2(192.168.2.76:22)..
Sat Nov 4 00:49:16 2017 - [debug] ok.
Sat Nov 4 00:49:16 2017 - [debug]
Sat Nov 4 00:49:16 2017 - [debug] Connecting via SSH from root@mysql_slave1(192.168.2.75:22) to root@mysql_master(192.168.2.74:22)..
Sat Nov 4 00:49:16 2017 - [debug] ok.
Sat Nov 4 00:49:16 2017 - [debug] Connecting via SSH from root@mysql_slave1(192.168.2.75:22) to root@mysql_slave2(192.168.2.76:22)..
Sat Nov 4 00:49:16 2017 - [debug] ok.
Sat Nov 4 00:49:17 2017 - [debug]
Sat Nov 4 00:49:16 2017 - [debug] Connecting via SSH from root@mysql_slave2(192.168.2.76:22) to root@mysql_master(192.168.2.74:22)..
Sat Nov 4 00:49:16 2017 - [debug] ok.
Sat Nov 4 00:49:16 2017 - [debug] Connecting via SSH from root@mysql_slave2(192.168.2.76:22) to root@mysql_slave1(192.168.2.75:22)..
Sat Nov 4 00:49:17 2017 - [debug] ok.
Sat Nov 4 00:49:17 2017 - [info]
[root@mysql_slave2 conf]#

This shows that the passwordless SSH setup from earlier is OK.
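The output above shows masterha_check_ssh testing every ordered pair of hosts. If one pair fails, the same pairs can be tried by hand. The following is a rough sketch under the assumption that the three hostnames resolve and that root logs in everywhere; the helper names ssh_pairs and check_ssh_pairs are made up for illustration.

```shell
#!/bin/sh
# Hosts from the deployment plan.
HOSTS="mysql_master mysql_slave1 mysql_slave2"

# Print every ordered "from to" pair of distinct hosts (6 pairs for 3 hosts).
ssh_pairs() {
    for from in $HOSTS; do
        for to in $HOSTS; do
            [ "$from" = "$to" ] || echo "$from $to"
        done
    done
}

# Hop to "from", then try a non-interactive login from there to "to".
# BatchMode=yes makes ssh fail instead of prompting for a password;
# -n keeps ssh from swallowing the pair list on stdin.
check_ssh_pairs() {
    ssh_pairs | while read -r from to; do
        if ssh -n -o BatchMode=yes "$from" "ssh -o BatchMode=yes $to true"; then
            echo "ok: $from -> $to"
        else
            echo "FAILED: $from -> $to"
        fi
    done
}
```

Only the function definitions run when the file is sourced; call `check_ssh_pairs` from the manager host to perform the actual logins.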
Check the MySQL replication configuration
[root@mysql_slave2 conf]# masterha_check_repl --conf=/etc/mha/conf/app1.cnf
Sat Nov 4 00:57:03 2017 - [warning] Global configuration file /etc/masterha_default.cnf not found. Skipping.
Sat Nov 4 00:57:03 2017 - [info] Reading application default configurations from /etc/mha/conf/app1.cnf..
Sat Nov 4 00:57:03 2017 - [info] Reading server configurations from /etc/mha/conf/app1.cnf..
Sat Nov 4 00:57:03 2017 - [info] MHA::MasterMonitor version 0.55.
Sat Nov 4 00:57:03 2017 - [info] Dead Servers:
Sat Nov 4 00:57:03 2017 - [info] Alive Servers:
Sat Nov 4 00:57:03 2017 - [info] mysql_master(192.168.2.74:55555)
Sat Nov 4 00:57:03 2017 - [info] mysql_slave1(192.168.2.75:55555)
Sat Nov 4 00:57:03 2017 - [info] mysql_slave2(192.168.2.76:55555)
Sat Nov 4 00:57:03 2017 - [info] Alive Slaves:
Sat Nov 4 00:57:03 2017 - [info] mysql_slave1(192.168.2.75:55555) Version=5.5.57-log (oldest major version between slaves) log-bin:enabled
Sat Nov 4 00:57:03 2017 - [info] Replicating from 192.168.2.74(192.168.2.74:55555)
Sat Nov 4 00:57:03 2017 - [info] Primary candidate for the new Master (candidate_master is set)
Sat Nov 4 00:57:03 2017 - [info] mysql_slave2(192.168.2.76:55555) Version=5.5.57-log (oldest major version between slaves) log-bin:enabled
Sat Nov 4 00:57:03 2017 - [info] Replicating from 192.168.2.74(192.168.2.74:55555)
Sat Nov 4 00:57:03 2017 - [info] Primary candidate for the new Master (candidate_master is set)
Sat Nov 4 00:57:03 2017 - [info] Current Alive Master: mysql_master(192.168.2.74:55555)
Sat Nov 4 00:57:03 2017 - [info] Checking slave configurations..
Sat Nov 4 00:57:03 2017 - [info] read_only=1 is not set on slave mysql_slave1(192.168.2.75:55555).
Sat Nov 4 00:57:03 2017 - [info] read_only=1 is not set on slave mysql_slave2(192.168.2.76:55555).
Sat Nov 4 00:57:03 2017 - [info] Checking replication filtering settings..
Sat Nov 4 00:57:03 2017 - [info] binlog_do_db= , binlog_ignore_db=
Sat Nov 4 00:57:03 2017 - [info] Replication filtering check ok.
Sat Nov 4 00:57:03 2017 - [info] Starting SSH connection tests..
Sat Nov 4 00:57:04 2017 - [info] All SSH connection tests passed successfully.
Sat Nov 4 00:57:04 2017 - [info] Checking MHA Node version..
Sat Nov 4 00:57:05 2017 - [info] Version check ok.
Sat Nov 4 00:57:05 2017 - [info] Checking SSH publickey authentication settings on the current master..
Sat Nov 4 00:57:05 2017 - [info] HealthCheck: SSH to mysql_master is reachable.
Sat Nov 4 00:57:05 2017 - [info] Master MHA Node version is 0.54.
Sat Nov 4 00:57:05 2017 - [info] Checking recovery script configurations on the current master..
Sat Nov 4 00:57:05 2017 - [info] Executing command: save_binary_logs --command=test --start_pos=4 --binlog_dir=/data/mysql/55555/logs/ --output_file=/var/tmp/save_binary_logs_test --manager_version=0.55 --start_file=mysql-bin.000006
Sat Nov 4 00:57:05 2017 - [info] Connecting to root@mysql_master(mysql_master)..
 Creating /var/tmp if not exists.. ok.
 Checking output directory is accessible or not.. ok.
 Binlog found at /data/mysql/55555/logs/, up to mysql-bin.000006
Sat Nov 4 00:57:05 2017 - [info] Master setting check done.
Sat Nov 4 00:57:05 2017 - [info] Checking SSH publickey authentication and checking recovery script configurations on all alive slave servers..
Sat Nov 4 00:57:05 2017 - [info] Executing command : apply_diff_relay_logs --command=test --slave_user='mha' --slave_host=mysql_slave1 --slave_ip=192.168.2.75 --slave_port=55555 --workdir=/var/tmp --target_version=5.5.57-log --manager_version=0.55 --relay_log_info=/data/mysql/55555/relay-log.info --relay_dir=/data/mysql/55555/ --slave_pass=xxx
Sat Nov 4 00:57:05 2017 - [info] Connecting to root@192.168.2.75(mysql_slave1:22)..
 Checking slave recovery environment settings..
 Opening /data/mysql/55555/relay-log.info ... ok.
 Relay log found at /data/mysql/55555, up to 55555-relay-bin.000002
 Temporary relay log file is /data/mysql/55555/55555-relay-bin.000002
 Testing mysql connection and privileges.. done.
 Testing mysqlbinlog output.. done.
 Cleaning up test file(s).. done.
Sat Nov 4 00:57:05 2017 - [info] Executing command : apply_diff_relay_logs --command=test --slave_user='mha' --slave_host=mysql_slave2 --slave_ip=192.168.2.76 --slave_port=55555 --workdir=/var/tmp --target_version=5.5.57-log --manager_version=0.55 --relay_log_info=/data/mysql/55555/relay-log.info --relay_dir=/data/mysql/55555/ --slave_pass=xxx
Sat Nov 4 00:57:05 2017 - [info] Connecting to root@192.168.2.76(mysql_slave2:22)..
 Checking slave recovery environment settings..
 Opening /data/mysql/55555/relay-log.info ... ok.
 Relay log found at /data/mysql/55555, up to 55555-relay-bin.000002
 Temporary relay log file is /data/mysql/55555/55555-relay-bin.000002
 Testing mysql connection and privileges.. done.
 Testing mysqlbinlog output.. done.
 Cleaning up test file(s).. done.
Sat Nov 4 00:57:05 2017 - [info] Slaves settings check done.
Sat Nov 4 00:57:05 2017 - [info]
mysql_master (current master)
 +--mysql_slave1
 +--mysql_slave2

Sat Nov 4 00:57:05 2017 - [info] Checking replication health on mysql_slave1..
Sat Nov 4 00:57:05 2017 - [info] ok.
Sat Nov 4 00:57:05 2017 - [info] Checking replication health on mysql_slave2..
Sat Nov 4 00:57:05 2017 - [info] ok.
Sat Nov 4 00:57:05 2017 - [warning] master_ip_failover_script is not defined.
Sat Nov 4 00:57:05 2017 - [warning] shutdown_script is not defined.
Sat Nov 4 00:57:05 2017 - [info] Got exit code 0 (Not master dead).

MySQL Replication Health is OK.
[root@mysql_slave2 conf]#

With both checks passing, the manager can now be started.
[root@mysql_slave2 conf]# masterha_manager --conf=/etc/mha/conf/app1.cnf --remove_dead_master_conf --ignore_last_failover   ## runs in the foreground
[root@mysql_slave2 ~]# masterha_check_status --conf=/etc/mha/conf/app1.cnf
app1 (pid:2575) is running(0:PING_OK), master:mysql_master   # the manager started OK
[root@mysql_slave2 ~]#

6. Configure the VIP
In /etc/mha/conf/app1.cnf, add one line to the [server default] section:
master_ip_failover_script=/etc/mha/conf/master_ip_failover

cat /etc/mha/conf/master_ip_failover
#!/usr/bin/env perl

# Copyright (C) 2011 DeNA Co.,Ltd.
#
# This program is free software; you can redistribute it and/or modify
# it under the terms of the GNU General Public License as published by
# the Free Software Foundation; either version 2 of the License, or
# (at your option) any later version.
#
# This program is distributed in the hope that it will be useful,
# but WITHOUT ANY WARRANTY; without even the implied warranty of
# MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
# GNU General Public License for more details.
#
# You should have received a copy of the GNU General Public License
# along with this program; if not, write to the Free Software
# Foundation, Inc.,
# 51 Franklin Street, Fifth Floor, Boston, MA 02110-1301 USA

## Note: This is a sample script and is not complete. Modify the script based on your environment.

use strict;
use warnings FATAL => 'all';

use Getopt::Long;
use MHA::DBHelper;

my (
  $command,        $ssh_user,         $orig_master_host,
  $orig_master_ip, $orig_master_port, $new_master_host,
  $new_master_ip,  $new_master_port,  $new_master_user,
  $new_master_password
);

my $vip = '192.168.2.199';
# NOTE: despite its name, this value is the subnet mask that ifconfig reports
# for eth0. arping expects a target IP address here (typically the gateway),
# so adjust it for your own network.
my $gateway = '255.255.248.0';
# Bring the VIP up on eth0:1, then send ARP packets so neighbours learn its new location.
my $ssh_start_vip = "sudo ifconfig eth0:1 $vip;sudo arping -c 3 -I eth0 -s $vip $gateway";
# Take the VIP alias down.
my $ssh_stop_vip = 'sudo ifconfig eth0:1 down';

GetOptions(
  'command=s'             => \$command,
  'ssh_user=s'            => \$ssh_user,
  'orig_master_host=s'    => \$orig_master_host,
  'orig_master_ip=s'      => \$orig_master_ip,
  'orig_master_port=i'    => \$orig_master_port,
  'new_master_host=s'     => \$new_master_host,
  'new_master_ip=s'       => \$new_master_ip,
  'new_master_port=i'     => \$new_master_port,
  'new_master_user=s'     => \$new_master_user,
  'new_master_password=s' => \$new_master_password,
);

exit &main();

sub main {
  if ( $command eq "stop" || $command eq "stopssh" ) {

    # $orig_master_host, $orig_master_ip, $orig_master_port are passed.
    # If you manage master ip address at global catalog database,
    # invalidate orig_master_ip here.
    my $exit_code = 1;
    eval {
      &stop_vip();

      # updating global catalog, etc
      $exit_code = 0;
    };
    if ($@) {
      warn "Got Error: $@\n";
      exit $exit_code;
    }
    exit $exit_code;
  }
  elsif ( $command eq "start" ) {

    # all arguments are passed.
    # If you manage master ip address at global catalog database,
    # activate new_master_ip here.
    # You can also grant write access (create user, set read_only=0, etc) here.
    my $exit_code = 10;
    eval {
      &start_vip();
      $exit_code = 0;
    };
    if ($@) {
      warn $@;

      # If you want to continue failover, exit 10.
      exit $exit_code;
    }
    exit $exit_code;
  }
  elsif ( $command eq "status" ) {

    # The status check (re)binds the VIP on the current master.
    print "Checking the status of the script: ssh -t $ssh_user\@$orig_master_host \"$ssh_start_vip\"\n";
    `ssh -t $ssh_user\@$orig_master_host \"$ssh_start_vip\"`;
    exit 0;
  }
  else {
    &usage();
    exit 1;
  }
}

sub usage {
  print
"Usage: master_ip_failover --command=start|stop|stopssh|status --orig_master_host=host --orig_master_ip=ip --orig_master_port=port --new_master_host=host --new_master_ip=ip --new_master_port=port\n";
}

# Bind the VIP on the new master over SSH.
sub start_vip() {
  print "Checking the start of the script: ssh -t $ssh_user\@$new_master_host \"$ssh_start_vip\"\n";
  `ssh -t $ssh_user\@$new_master_host \"$ssh_start_vip\"`;
}

# Release the VIP on the original master over SSH.
sub stop_vip() {
  print "Checking the stop/stopssh of the script: ssh -t $ssh_user\@$orig_master_host \"$ssh_stop_vip\"\n";
  `ssh -t $ssh_user\@$orig_master_host \"$ssh_stop_vip\"`;
}

Restart the manager
masterha_stop --conf=/etc/mha/conf/app1.cnf
masterha_manager --conf=/etc/mha/conf/app1.cnf --remove_dead_master_conf --ignore_last_failover

Check mysql_master: the VIP is now bound on the master.
[root@mysql_master ~]# ifconfig
eth0      Link encap:Ethernet  HWaddr 00:0C:29:74:4D:FB
          inet addr:192.168.2.74  Bcast:192.168.7.255  Mask:255.255.248.0
          UP BROADCAST RUNNING MULTICAST  MTU:1500  Metric:1
          RX packets:108407 errors:0 dropped:0 overruns:0 frame:0
          TX packets:7780 errors:0 dropped:0 overruns:0 carrier:0
          collisions:0 txqueuelen:1000
          RX bytes:18041186 (17.2 MiB)  TX bytes:789173 (770.6 KiB)

eth0:1    Link encap:Ethernet  HWaddr 00:0C:29:74:4D:FB
          inet addr:192.168.2.199  Bcast:192.168.2.255  Mask:255.255.255.0
          UP BROADCAST RUNNING MULTICAST  MTU:1500  Metric:1

eth2      Link encap:Ethernet  HWaddr 00:0C:29:74:4D:05
          inet addr:172.16.16.1  Bcast:172.16.16.15  Mask:255.255.255.240
          UP BROADCAST RUNNING MULTICAST  MTU:1500  Metric:1
          RX packets:96363 errors:0 dropped:0 overruns:0 frame:0
          TX packets:220 errors:0 dropped:0 overruns:0 carrier:0
          collisions:0 txqueuelen:1000
          RX bytes:6307546 (6.0 MiB)  TX bytes:9240 (9.0 KiB)

lo        Link encap:Local Loopback
          inet addr:127.0.0.1  Mask:255.0.0.0
          UP LOOPBACK RUNNING  MTU:65536  Metric:1
          RX packets:84 errors:0 dropped:0 overruns:0 frame:0
          TX packets:84 errors:0 dropped:0 overruns:0 carrier:0
          collisions:0 txqueuelen:0
          RX bytes:14900 (14.5 KiB)  TX bytes:14900 (14.5 KiB)

[root@mysql_master ~]#
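Reading the whole ifconfig listing by eye gets tedious when the check is repeated on several hosts. A minimal sketch of a scripted check follows; it assumes the CentOS 6 net-tools output format shown above ("inet addr:<ip>" followed by a space), and the function name has_vip is invented for this example.

```shell
#!/bin/sh
# Sketch: succeed if the VIP given as $1 appears in ifconfig-style text on stdin.
has_vip() {
    # -F treats the dots literally; the trailing space keeps a search for
    # 192.168.2.19 from matching a line that carries 192.168.2.199.
    grep -qF "inet addr:$1 "
}
```

On a host, `ifconfig | has_vip 192.168.2.199 && echo "VIP is bound here"` prints the message only where the VIP is currently configured.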
7. Testing
Note: the mha-manager above was started under screen.
Now stop the master and watch the mha-manager log, /etc/mha/logs/manager.log:
----- Failover Report -----

app1: MySQL Master failover mysql_master to mysql_slave1 succeeded

Master mysql_master is down!

Check MHA Manager logs at mysql_slave2:/etc/mha/logs/manager.log for details.

Started automated(non-interactive) failover.
Invalidated master IP address on mysql_master.
The latest slave mysql_slave1(192.168.2.75:55555) has all relay logs for recovery.
Selected mysql_slave1 as a new master.
mysql_slave1: OK: Applying all logs succeeded.
mysql_slave1: OK: Activated master IP address.
mysql_slave2: This host has the latest relay log events.
Generating relay diff files from the latest slave succeeded.
mysql_slave2: OK: Applying all logs succeeded. Slave started, replicating from mysql_slave1.
mysql_slave1: Resetting slave info succeeded.
Master failover to mysql_slave1(192.168.2.75:55555) completed successfully.

The log shows that the master role has moved to 192.168.2.75 (mysql_slave1).
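If a follow-up script needs to know where the master went, the last line of the report carries the host, IP and port. Here is a small sketch that extracts them; it assumes the report line has exactly the wording shown above, and the function name new_master_from_report is hypothetical.

```shell
#!/bin/sh
# Sketch: read manager log text on stdin and print "host ip port" of the
# new master, taken from the final line of the failover report.
new_master_from_report() {
    sed -n 's/^Master failover to \([^(]*\)(\([0-9.]*\):\([0-9]*\)) completed successfully\.$/\1 \2 \3/p'
}
```

For example, `new_master_from_report < /etc/mha/logs/manager.log` prints one line per completed failover recorded in the log, and nothing if no failover has completed.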
[root@mysql_slave1 ~]# ifconfig
eth0      Link encap:Ethernet  HWaddr 00:0C:29:FB:B5:98
          inet addr:192.168.2.75  Bcast:192.168.7.255  Mask:255.255.248.0
          UP BROADCAST RUNNING MULTICAST  MTU:1500  Metric:1
          RX packets:110418 errors:0 dropped:0 overruns:0 frame:0
          TX packets:6705 errors:0 dropped:0 overruns:0 carrier:0
          collisions:0 txqueuelen:1000
          RX bytes:18166949 (17.3 MiB)  TX bytes:709681 (693.0 KiB)

eth0:1    Link encap:Ethernet  HWaddr 00:0C:29:FB:B5:98
          inet addr:192.168.2.199  Bcast:192.168.2.255  Mask:255.255.255.0
          UP BROADCAST RUNNING MULTICAST  MTU:1500  Metric:1

eth2      Link encap:Ethernet  HWaddr 00:0C:29:FB:B5:A2
          inet addr:172.16.16.2  Bcast:172.16.16.15  Mask:255.255.255.240
          UP BROADCAST RUNNING MULTICAST  MTU:1500  Metric:1
          RX packets:100427 errors:0 dropped:0 overruns:0 frame:0
          TX packets:230 errors:0 dropped:0 overruns:0 carrier:0
          collisions:0 txqueuelen:1000
          RX bytes:6577954 (6.2 MiB)  TX bytes:9660 (9.4 KiB)

lo        Link encap:Local Loopback
          inet addr:127.0.0.1  Mask:255.0.0.0
          UP LOOPBACK RUNNING  MTU:65536  Metric:1
          RX packets:201 errors:0 dropped:0 overruns:0 frame:0
          TX packets:201 errors:0 dropped:0 overruns:0 carrier:0
          collisions:0 txqueuelen:0
          RX bytes:17985 (17.5 KiB)  TX bytes:17985 (17.5 KiB)

[root@mysql_slave1 ~]#

The VIP has floated over to mysql_slave1. Now check the manager's status again:
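[root@mysql_slave2 logs]# masterha_check_status --conf=/etc/mha/conf/app1.cnf
app1 is stopped(2:NOT_RUNNING).
[root@mysql_slave2 logs]#

Sure enough, the manager has stopped. This is not specific to starting it under screen: masterha_manager exits once it has completed a failover. To have it brought back up automatically, the official recommendation is daemontools, which is not covered in detail here.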
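For readers who want a starting point with daemontools, a service is a directory holding an executable run script that supervise starts and restarts. The sketch below only generates such a script. The helper name write_mha_run_script, the service directory, and the output file /etc/mha/logs/manager.out are assumptions for illustration; the masterha_manager flags are the ones used earlier in this post.

```shell
#!/bin/sh
# Sketch: create a daemontools-style service directory ($1) with a run script
# that keeps masterha_manager in the foreground under supervise.
write_mha_run_script() {
    mkdir -p "$1" || return 1
    cat > "$1/run" <<'EOF'
#!/bin/sh
# Started, and restarted after exit, by daemontools' supervise.
exec masterha_manager --conf=/etc/mha/conf/app1.cnf \
    --remove_dead_master_conf --ignore_last_failover \
    >> /etc/mha/logs/manager.out 2>&1
EOF
    chmod 755 "$1/run"
}
```

Something like `write_mha_run_script /service/masterha_app1` would place the script where svscan conventionally looks for services. Whether automatically restarting the manager with --ignore_last_failover is acceptable depends on how much unattended failover you want, so review the flags before using this in production.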