MySQL高可用架构中MHA的本质以及如何部署
发表于:2025-01-21 作者:千家信息网编辑
千家信息网最后更新 2025年01月21日,这篇文章将为大家详细讲解有关MySQL高可用架构中MHA的本质以及如何部署,文章内容质量较高,因此小编分享给大家做个参考,希望大家阅读完这篇文章后对相关知识有一定的了解。MySQL高可用架构之MHA1
千家信息网最后更新 2025年01月21日MySQL高可用架构中MHA的本质以及如何部署
这篇文章将为大家详细讲解有关MySQL高可用架构中MHA的本质以及如何部署,文章内容质量较高,因此小编分享给大家做个参考,希望大家阅读完这篇文章后对相关知识有一定的了解。
MySQL高可用架构之MHA
1、关于MHA
MHA(Master HA)是一款开源的MySQL的高可用程序,它为MySQL主从复制架构提供了automating master failover功能。MHA在监控到master节点故障时,会提升其中拥有的最新数据的slave节点成为新的master节点,在此期间,MHA会通过其它从节点获取额外信息来避免一致性方面的问题。MHA还提供了master节点的在线切换功能,即按需切换master/slave节点。
MHA服务有两种角色,MHA Manager(管理节点)和MHA Node(数据节点):
MHA Manager:通常单独部署在一台独立机器上管理多个master/slave集群,每个master/slave集群称为一个application;MHA node:运行在每台MySQL服务器上,它通过监控具备解析和清理log功能的脚本来加快故障转移
2、MHA组件说明
Manager节点:
-masterha_check_ssh:MHA依赖的SSH环境检测工具;-masterha_check_repl:MySQL复制环境检测工具;-masterha_manager:MHA服务主程序;-masterha_check_status:MHA运行状态探测工具;-masterha_master_monitor:MySQL master节点可用性检测工具;-masterha_master_switch:master节点切换工具;-masterha_conf_host:添加或删除配置的节点;-masterha_stop:关闭MHA服务的工具;
Node节点:
-save_binary_logs:保存和复制master的二进制日志;-apply_diff_relay_logs:识别差异的中继日志事件并用于其他slave;-fiter_mysqlbinlog:去除不必要的ROLLBACK事件(MHA已不再使用这个工具);-purge_relay_logs:清除中继日志(不会阻塞SQL线程);
自定义扩展:
-secondary_check_script:通过多条网络路由检测master的可用性;-master_ip_failover_script:更新appliction使用的masterip;-shutdown_script:强制关闭master节点;-report_script:发送报告;-init_conf_load_script:加载初始配置参数;-master_ip_online_change_script:更新master节点ip地址
3、部署及测试
实验拓扑:
node1:192.168.150.137 MHA managernode2:192.168.150.138 MHA node mariadb masternode3:192.168.150.139 MHA node mariadb slave candidatenode4:192.168.150.140 MHA node mariadb slave
配置过程:
1、修改每台服务器的hosts文件/etc/hosts127.0.0.1 localhost localhost.localdomain localhost4 localhost4.localdomain4::1 localhost localhost.localdomain localhost6 localhost6.localdomain6192.168.150.137 node1.com node1192.168.150.138 node2.com node2192.168.150.139 node3.com node3192.168.150.140 node4.com node42、node2-node4进行mariadb的yum安装yum -y install mariadb-server3、配置ssh互信通信环境[root@node1 ~]# ssh-keygen -t rsa -P ''Generating public/private rsa key pair.Enter file in which to save the key (/root/.ssh/id_rsa): Created directory '/root/.ssh'.Your identification has been saved in /root/.ssh/id_rsa.Your public key has been saved in /root/.ssh/id_rsa.pub.The key fingerprint is:a2:f2:10:28:cd:ea:7b:d8:f4:95:15:6e:73:a6:9d:4e root@node1.comThe key's randomart p_w_picpath is:+--[ RSA 2048]----+| || . || . . || + = o ||o + . S * . ||.. o . + . E ||. * o . o ||.. * . . || oo . |+-----------------+[root@node1 ~]# ls .ssh/id_rsaid_rsa id_rsa.pub [root@node1 ~]# cat .ssh/id_rsa.pub > .ssh/authorized_keys[root@node1 ~]# ssh node1The authenticity of host 'node1 (192.168.150.137)' can't be established.ECDSA key fingerprint is 2a:e3:03:52:8c:84:02:59:a2:26:a3:b2:f6:74:6c:3c.Are you sure you want to continue connecting (yes/no)? yesWarning: Permanently added 'node1,192.168.150.137' (ECDSA) to the list of known hosts.Last login: Wed Mar 29 15:06:52 2017 from 192.168.150.1[root@node1 ~]# ll .ssh/authorized_keys -rw-r--r-- 1 root root 396 3月 29 15:35 .ssh/authorized_keys[root@node1 ~]# chmod go= .ssh/authorized_keys [root@node1 ~]# scp -p .ssh/id_rsa .ssh/authorized_keys node2:/root/.sshroot@node2's password: id_rsa 100% 1675 1.6KB/s 00:00 authorized_keys 100% 396 0.4KB/s 00:00 [root@node1 ~]# scp -p .ssh/id_rsa .ssh/authorized_keys node3:/root/.sshThe authenticity of host 'node3 (192.168.150.139)' can't be established.ECDSA key fingerprint is 2a:e3:03:52:8c:84:02:59:a2:26:a3:b2:f6:74:6c:3c.Are you sure you want to continue connecting (yes/no)? yesWarning: Permanently added 'node3,192.168.150.139' (ECDSA) to the list of known hosts.root@node3's password: id_rsa 100% 1675 1.6KB/s 00:00 authorized_keys 100% 396 0.4KB/s 00:00 [root@node1 ~]# scp -p .ssh/id_rsa .ssh/authorized_keys node4:/root/.sshroot@node4's password: Permission denied, please try again.root@node4's password: id_rsa 100% 1675 1.6KB/s 00:00 authorized_keys 100% 396 0.4KB/s 00:00 [root@node1 ~]# ssh node2Last login: Wed Mar 29 15:07:05 2017 from 192.168.150.1[root@node2 ~]# exit登出Connection to node2 closed.[root@node1 ~]# ssh node3Last login: Wed Mar 29 15:07:18 2017 from 192.168.150.1[root@node3 ~]# exit登出Connection to node3 closed.[root@node1 ~]# ssh node3Last login: Wed Mar 29 15:40:05 2017 from node1.com[root@node3 ~]# exit登出Connection to node3 closed.[root@node1 ~]# ssh node4Last failed login: Wed Mar 29 15:39:53 CST 2017 from node1.com on ssh:nottyThere was 1 failed login attempt since the last successful login.Last login: Wed Mar 29 15:39:30 2017 from node1.com[root@node4 ~]# exit登出Connection to node4 closed.[root@node1 ~]# ssh node4Last login: Wed Mar 29 15:40:13 2017 from node1.com[root@node4 ~]# exit登出Connection to node4 closed.4、修改mysql参数master:[mysqld]innodb_file_per_table = 1skip_name_resolve = 1log-bin = master-binrelay-log = relay-binserver_id = 1slave:[mysqld]innode_file_per_table = 1skip_name_resolve = 1log-bin = master-binrelay-log = relay-binserver_id = 2read_only = 1relay_log_purge = 05、主库开启并创建授权账号master:MariaDB [(none)]> SHOW MASTER STATUS;+-------------------+----------+--------------+------------------+| File | Position | Binlog_Do_DB | Binlog_Ignore_DB |+-------------------+----------+--------------+------------------+| master-bin.000003 | 245 | | |+-------------------+----------+--------------+------------------+1 row in set (0.00 sec)MariaDB [(none)]> GRANT REPLICATION SLAVE,REPLICATION CLIENT ON *.* TO 'repluser'@'192.168.%.%' IDENTIFIED BY 'replpass';MariaDB [(none)]> GRANT ALL ON *.* TO 'mhauser'@'192.168.%.%' IDENTIFIED BY 'mhapass'; #此为mha的管理账号MariaDB [(none)]> SHOW GLOBAL VARIABLES LIKE 'read_only'; #此时主库是可写可读+---------------+-------+| Variable_name | Value |+---------------+-------+| read_only | OFF |+---------------+-------+1 row in set (0.00 sec)6、从库进行主从功能开启 node3、node4操作相同[root@node3 ~]# mysqlWelcome to the MariaDB monitor. Commands end with ; or \g.Your MariaDB connection id is 2Server version: 5.5.52-MariaDB MariaDB ServerCopyright (c) 2000, 2016, Oracle, MariaDB Corporation Ab and others.Type 'help;' or '\h' for help. Type '\c' to clear the current input statement.MariaDB [(none)]> CHANGE MASTER TO MASTER_HOST='192.168.150.138',MASTER_USER='repluser',MSTER_PASSWORD='replpass',MASTER_LOG_FILE='master-bin.000003',MASTER_LOG_POS=245;Query OK, 0 rows affected (0.01 sec)MariaDB [(none)]> SHOW SLAVE STATUS\G*************************** 1. row *************************** Slave_IO_State: Master_Host: 192.168.150.138 Master_User: repluser Master_Port: 3306 Connect_Retry: 60 Master_Log_File: master-bin.000003 Read_Master_Log_Pos: 245 Relay_Log_File: relay-bin.000001 Relay_Log_Pos: 4 Relay_Master_Log_File: master-bin.000003 Slave_IO_Running: No Slave_SQL_Running: No Replicate_Do_DB: Replicate_Ignore_DB: Replicate_Do_Table: Replicate_Ignore_Table: Replicate_Wild_Do_Table: Replicate_Wild_Ignore_Table: Last_Errno: 0 Last_Error: Skip_Counter: 0 Exec_Master_Log_Pos: 245 Relay_Log_Space: 245 Until_Condition: None Until_Log_File: Until_Log_Pos: 0 Master_SSL_Allowed: No Master_SSL_CA_File: Master_SSL_CA_Path: Master_SSL_Cert: Master_SSL_Cipher: Master_SSL_Key: Seconds_Behind_Master: NULLMaster_SSL_Verify_Server_Cert: No Last_IO_Errno: 0 Last_IO_Error: Last_SQL_Errno: 0 Last_SQL_Error: Replicate_Ignore_Server_Ids: Master_Server_Id: 01 row in set (0.00 sec)MariaDB [(none)]> START SLAVE;Query OK, 0 rows affected (0.00 sec)MariaDB [(none)]> SHOW SLAVE STATUS\G*************************** 1. row *************************** Slave_IO_State: Waiting for master to send event Master_Host: 192.168.150.138 Master_User: repluser Master_Port: 3306 Connect_Retry: 60 Master_Log_File: master-bin.000003 Read_Master_Log_Pos: 497 Relay_Log_File: relay-bin.000002 Relay_Log_Pos: 782 Relay_Master_Log_File: master-bin.000003 Slave_IO_Running: Yes Slave_SQL_Running: Yes Replicate_Do_DB: Replicate_Ignore_DB: Replicate_Do_Table: Replicate_Ignore_Table: Replicate_Wild_Do_Table: Replicate_Wild_Ignore_Table: Last_Errno: 0 Last_Error: Skip_Counter: 0 Exec_Master_Log_Pos: 497 Relay_Log_Space: 1070 Until_Condition: None Until_Log_File: Until_Log_Pos: 0 Master_SSL_Allowed: No Master_SSL_CA_File: Master_SSL_CA_Path: Master_SSL_Cert: Master_SSL_Cipher: Master_SSL_Key: Seconds_Behind_Master: 0Master_SSL_Verify_Server_Cert: No Last_IO_Errno: 0 Last_IO_Error: Last_SQL_Errno: 0 Last_SQL_Error: Replicate_Ignore_Server_Ids: Master_Server_Id: 11 row in set (0.00 sec)MariaDB [(none)]> SHOW GLOBAL VARIABLES LIKE 'read_only'; #此时从库是只读模式+---------------+-------+| Variable_name | Value |+---------------+-------+| read_only | ON |+---------------+-------+1 row in set (0.00 sec)7、此时一主两从架构已经配置完成,安装MHA包manager节点(node1):安装mha4mysql-manager-0.56-0.el6.noarch.rpm和mha4mysql-node-0.56-0.el6.noarch.rpmyum install mha4mysql* -ynode节点(node2-node4):安装mha4mysql-node-0.56-0.el6.noarch.rpmyum -y iinstall mha4mysql-node-0.56-0.el6.noarch.rpm8、初始化MHA创建配置目录及配置文件(在node1上执行)[root@node1 ~]# mkdir /etc/masterha[root@node1 ~]# vim /etc/masterha/app1.cnf [server default]user=mhauserpassword=mhapassmanager_workdir=/data/masterha/app1master_log=/data/masterha/app1/manager.logremote_workdir=/data/masterha/app1ssh_user=rootrepl_user=repluserrepl_password=replpassping_interval=1[server1]hostname=192.168.150.138candidate_master=1[server2]hostname=192.168.150.139candidate_master=1[server3]hostname=192.168.150.1409、启动前检测ssh互信配置是否OK[root@node1 ~]# masterha_check_ssh --conf=/etc/masterha/app1.cnf......Warning: Permanently added '192.168.150.139' (ECDSA) to the list of known hosts.Wed Mar 29 17:03:03 2017 - [debug] ok.Wed Mar 29 17:03:03 2017 - [info] All SSH connection tests passed successfully.mysql复制集群的连接配置是否OK[root@node1 ~]# masterha_check_repl --conf=/etc/masterha/app1.cnf......Wed Mar 29 17:04:40 2017 - [info] 192.168.150.138(192.168.150.138:3306) (current master) +--192.168.150.139(192.168.150.139:3306) +--192.168.150.140(192.168.150.140:3306)Wed Mar 29 17:04:40 2017 - [info] Checking replication health on 192.168.150.139..Wed Mar 29 17:04:40 2017 - [info] ok.Wed Mar 29 17:04:40 2017 - [info] Checking replication health on 192.168.150.140..Wed Mar 29 17:04:40 2017 - [info] ok.Wed Mar 29 17:04:40 2017 - [warning] master_ip_failover_script is not defined.Wed Mar 29 17:04:40 2017 - [warning] shutdown_script is not defined.Wed Mar 29 17:04:40 2017 - [info] Got exit code 0 (Not master dead).MySQL Replication Health is OK.10、启动MHA[root@node1 ~]# nohup masterha_manager --conf=/etc/masterha/app1.cnf > /data/masterha/app1/manager.log 2>&1 &[1] 16989[root@node1 ~]# tail -f /data/masterha/app1/manager.log 192.168.150.138(192.168.150.138:3306) (current master) +--192.168.150.139(192.168.150.139:3306) +--192.168.150.140(192.168.150.140:3306)Wed Mar 29 21:51:58 2017 - [warning] master_ip_failover_script is not defined.Wed Mar 29 21:51:58 2017 - [warning] shutdown_script is not defined.Wed Mar 29 21:51:58 2017 - [info] Set master ping interval 1 seconds.Wed Mar 29 21:51:58 2017 - [warning] secondary_check_script is not defined. It is highly recommended setting it to check master reachability from two or more routes.Wed Mar 29 21:51:58 2017 - [info] Starting ping health check on 192.168.150.138(192.168.150.138:3306)..Wed Mar 29 21:51:58 2017 - [info] Ping(SELECT) succeeded, waiting until MySQL doesn't respond..11、启动后查看master节点状态[root@node1 ~]# masterha_check_status --conf=/etc/masterha/app1.cnfapp1 (pid:16623) is running(0:PING_OK), master:192.168.150.13812、进行故障转移测试(1)maser节点关闭mariadb[root@node2 ~]# killall mysqld mysqld_safe(2)此时在manager上可以看到转移的日志----- Failover Report -----app1: MySQL Master failover 192.168.150.138(192.168.150.138:3306) to 192.168.150.139(192.168.150.139:3306) succeededMaster 192.168.150.138(192.168.150.138:3306) is down!Check MHA Manager logs at node1.com:/data/masterha/app1/manager.log for details.Started automated(non-interactive) failover.The latest slave 192.168.150.139(192.168.150.139:3306) has all relay logs for recovery.Selected 192.168.150.139(192.168.150.139:3306) as a new master.192.168.150.139(192.168.150.139:3306): OK: Applying all logs succeeded.192.168.150.140(192.168.150.140:3306): This host has the latest relay log events.Generating relay diff files from the latest slave succeeded.192.168.150.140(192.168.150.140:3306): OK: Applying all logs succeeded. Slave started, replicating from 192.168.150.139(192.168.150.139:3306)192.168.150.139(192.168.150.139:3306): Resetting slave info succeeded.Master failover to 192.168.150.139(192.168.150.139:3306) completed successfully.故障转移后,manager会自动停止,此时查看master状态[root@node1 ~]# masterha_check_status --conf=/etc/masterha/app1.cnfapp1 is stopped(2:NOT_RUNNING).(3)查看其它两个库状态node3 已成功接管master,并可读写MariaDB [(none)]> SHOW MASTER STATUS;+-------------------+----------+--------------+------------------+| File | Position | Binlog_Do_DB | Binlog_Ignore_DB |+-------------------+----------+--------------+------------------+| master-bin.000003 | 245 | | |+-------------------+----------+--------------+------------------+1 row in set (0.00 sec)MariaDB [(none)]> SHOW GLOBAL VARIABLES LIKE 'read_only';+---------------+-------+| Variable_name | Value |+---------------+-------+| read_only | OFF |+---------------+-------+1 row in set (0.00 sec)node4MariaDB [(none)]> SHOW SLAVE STATUS\G*************************** 1. row *************************** Slave_IO_State: Waiting for master to send event Master_Host: 192.168.150.139 Master_User: repluser Master_Port: 3306 Connect_Retry: 60 Master_Log_File: master-bin.000003 Read_Master_Log_Pos: 245 Relay_Log_File: relay-bin.000002 Relay_Log_Pos: 530 Relay_Master_Log_File: master-bin.000003 Slave_IO_Running: Yes Slave_SQL_Running: Yes Replicate_Do_DB: Replicate_Ignore_DB: Replicate_Do_Table: Replicate_Ignore_Table: Replicate_Wild_Do_Table: Replicate_Wild_Ignore_Table: Last_Errno: 0 Last_Error: Skip_Counter: 0 Exec_Master_Log_Pos: 245 Relay_Log_Space: 818 Until_Condition: None Until_Log_File: Until_Log_Pos: 0 Master_SSL_Allowed: No Master_SSL_CA_File: Master_SSL_CA_Path: Master_SSL_Cert: Master_SSL_Cipher: Master_SSL_Key: Seconds_Behind_Master: 0Master_SSL_Verify_Server_Cert: No Last_IO_Errno: 0 Last_IO_Error: Last_SQL_Errno: 0 Last_SQL_Error: Replicate_Ignore_Server_Ids: Master_Server_Id: 21 row in set (0.00 sec)MariaDB [(none)]> SHOW GLOBAL VARIABLES LIKE 'read_only';+---------------+-------+| Variable_name | Value |+---------------+-------+| read_only | ON |+---------------+-------+1 row in set (0.00 sec)(4)提供新的从节点已修复复制集群master界定啊故障后,需要重新准备好一个新的MySQL节点。基于来自于master节点的备份恢复后,将其重新配置为mster的从节点即可。新加入节点IP为原master节点IP,否则还得修改appl.cnf中相应的设置,最后再次启动manager,并再次检查状态。[root@node2 ~]# rm -rf /var/lib/mysql/*[root@node2 ~]# vim /etc/my.cnf添加从库两选项read_only = 1relay_log_purge = 0[root@node2 ~]# systemctl start mariadb.service[root@node2 ~]# mysqlWelcome to the MariaDB monitor. Commands end with ; or \g.Your MariaDB connection id is 2Server version: 5.5.52-MariaDB MariaDB ServerCopyright (c) 2000, 2016, Oracle, MariaDB Corporation Ab and others.Type 'help;' or '\h' for help. Type '\c' to clear the current input statement.MariaDB [(none)]> GRANT REPLICATION SLAVE,REPLICATION CLIENT ON *.* TO 'repluser'@'192.168.%.%' IDENTIFIED BY 'replpass';Query OK, 0 rows affected (0.00 sec)MariaDB [(none)]> GRANT ALL ON *.* TO 'mhauser'@'192.168.%.%' IDENTIFIED BY 'mhapass';Query OK, 0 rows affected (0.00 sec)MariaDB [(none)]> FLUSH PRIVILEGES;Query OK, 0 rows affected (0.00 sec)MariaDB [(none)]> CHANGE MASTER TO MASTER_HOST='192.168.150.139',MASTER_USER='repluser',MASTER_PASSWORD='replpass',MASTER_LOG_FILE='master-bin.000003',MASTER_LOG_POS=245;Query OK, 0 rows affected (0.01 sec)MariaDB [(none)]> SHOW SLAVE\GERROR 1064 (42000): You have an error in your SQL syntax; check the manual that corresponds to your MariaDB server version for the right syntax to use near '' at line 1MariaDB [(none)]> START SLAVE;Query OK, 0 rows affected (0.01 sec)MariaDB [(none)]> SHOW SLAVE STATUS\G*************************** 1. row *************************** Slave_IO_State: Waiting for master to send event Master_Host: 192.168.150.139 Master_User: repluser Master_Port: 3306 Connect_Retry: 60 Master_Log_File: master-bin.000003 Read_Master_Log_Pos: 245 Relay_Log_File: relay-bin.000002 Relay_Log_Pos: 530 Relay_Master_Log_File: master-bin.000003 Slave_IO_Running: Yes Slave_SQL_Running: Yes Replicate_Do_DB: Replicate_Ignore_DB: Replicate_Do_Table: Replicate_Ignore_Table: Replicate_Wild_Do_Table: Replicate_Wild_Ignore_Table: Last_Errno: 0 Last_Error: Skip_Counter: 0 Exec_Master_Log_Pos: 245 Relay_Log_Space: 818 Until_Condition: None Until_Log_File: Until_Log_Pos: 0 Master_SSL_Allowed: No Master_SSL_CA_File: Master_SSL_CA_Path: Master_SSL_Cert: Master_SSL_Cipher: Master_SSL_Key: Seconds_Behind_Master: 0Master_SSL_Verify_Server_Cert: No Last_IO_Errno: 0 Last_IO_Error: Last_SQL_Errno: 0 Last_SQL_Error: Replicate_Ignore_Server_Ids: Master_Server_Id: 21 row in set (0.00 sec)再次开启manager查看状态,状态全部OK,主库变更为node3[root@node1 ~]# nohup masterha_manager --conf=/etc/masterha/app1.cnf > /data/masterha/app1/manager.log 2>&1 &[root@node1 ~]# masterha_check_status --conf=/etc/masterha/app1.cnfapp1 (pid:17229) is running(0:PING_OK), master:192.168.150.139
补充:
提供额外机制,防止对master的监控做出误判、VIP添加、在进行故障转移时对原有master节点执行STONITH操作避免脑裂,可通过shutdown_scrip实现、必要时,进行在线master节点转换;
关于MySQL高可用架构中MHA的本质以及如何部署就分享到这里了,希望以上内容可以对大家有一定的帮助,可以学到更多知识。如果觉得文章不错,可以把它分享出去让更多的人看到。
节点
配置
工具
状态
故障
架构
服务
检测
功能
日志
集群
再次
环境
切换
监控
管理
本质
主从
事件
内容
数据库的安全要保护哪些东西
数据库安全各自的含义是什么
生产安全数据库录入
数据库的安全性及管理
数据库安全策略包含哪些
海淀数据库安全审计系统
建立农村房屋安全信息数据库
易用的数据库客户端支持安全管理
连接数据库失败ssl安全错误
数据库的锁怎样保障安全
互联网科技智能手表
北京软件开发设施价格优惠
无锡软件开发产业化
网络安全在北京工作二年工资
苏州万度网络技术有限公司
装数据库的时候密码是什么
嘉讯软件开发
网络技术应用理论
什么是数据库dba权限
申请网络安全审查
尼基塔服务器在哪
删除服务器如何找回
网络技术主管能力
连接数据库的jar包怎么弄
数据库双机热备份和冷备份
小型数据库使用方法
python实现网络安全协议
中宾网络技术官网
广西网络安全信息报警网站
mysql数据库打开工具
sftp服务器搭建
深圳易思博软件开发有限公司
如何学习数据库与算法
深圳达众互联网科技
国产数据库是硬件还是软件
数据库数据
数据库系统释义
广西网络安全信息报警网站
cvi ado 数据库
网络技术公司通用章程