MySQL MHA切换失败一例
先看下引起问题的密码啥样, 包含两个特殊字符[和~.
$ egrep -w 'user|password' /etc/masterha/app1.cnf
password=P[AI3M~5z
user=mha_mgr
用户mha_mgr的作用, 如下文档中的说明, 可见其对数据库实例起到管理的作用.
MySQL administrative database username to the target MySQL server. This should be root because it runs all necessary administrative commands such as STOP SLAVE, CHANGE MASTER, RESET SLAVE.
看看两种场景下切换失败的现象.
1. 手动在线切换, 据日志可看到, MHA将密码中的特殊字符转义了, 致使用户mha_mgr连接数据库失败.
检查原有的主从复制状态, 发现无变化, 即切换失败, 并没造成影响.
$ masterha_master_switch --conf=/etc/masterha/app1.cnf --master_state=alive --orig_master_is_new_slave
...
Thu Jan 11 10:28:39 2018 - [info] * Phase 2: Rejecting updates Phase..
Thu Jan 11 10:28:39 2018 - [info]
Thu Jan 11 10:28:39 2018 - [info] Executing master ip online change script to disable write on the current master:
Thu Jan 11 10:28:39 2018 - [info] /usr/local/bin/master_ip_online_change --command=stop --orig_master_host=192.168.4.42 --orig_master_ip=192.168.4.42 --orig_master_port=3306 --orig_master_user='mha_mgr' --orig_master_password='P\[AI3M\~5z' --new_master_host=192.168.4.43 --new_master_ip=192.168.4.43 --new_master_port=3306 --new_master_user='mha_mgr' --new_master_password='P\[AI3M\~5z' --orig_master_ssh_user=mysql --new_master_ssh_user=mysql --orig_master_is_new_slave
ARGS: $VAR1 = [
'--command=stop',
'--orig_master_host=192.168.4.42',
'--orig_master_ip=192.168.4.42',
'--orig_master_port=3306',
'--orig_master_user=mha_mgr',
'--orig_master_password=P\\[AI3M\\~5z',
'--new_master_host=192.168.4.43',
'--new_master_ip=192.168.4.43',
'--new_master_port=3306',
'--new_master_user=mha_mgr',
'--new_master_password=P\\[AI3M\\~5z',
'--orig_master_ssh_user=mysql',
'--new_master_ssh_user=mysql',
'--orig_master_is_new_slave'
];
Unknown option: orig_master_ssh_user
Unknown option: new_master_ssh_user
Unknown option: orig_master_is_new_slave
Got Error: DBI connect(';host=192.168.4.43;port=3306;mysql_connect_timeout=4','mha_mgr',...) failed: Access denied for user 'mha_mgr'@'192.168.4.45' (using password: YES) at /usr/local/share/perl5/MHA/DBHelper.pm line 205.
at /usr/local/bin/master_ip_online_change line 132.
Thu Jan 11 10:28:39 2018 - [error][/usr/local/share/perl5/MHA/ManagerUtil.pm, ln177] Got ERROR: at /usr/local/bin/masterha_master_switch line 53.
2. 自动故障切换(手动关闭主库, 模拟故障), 日志最后出现了"completed successfully"的字样, 貌似没问题, 但详细查看日志, 发现一处和上面类似的报错, 见标黄处.
检查原有的复制状态, 发现复制关系切换正常, 但是写虚拟IP不见了, 即不可写了.
$ tail -f /var/log/masterha/manager.log
...
Fri Jan 12 10:33:55 2018 - [info] Executing master IP activate script:
Fri Jan 12 10:33:55 2018 - [info] /usr/local/bin/master_ip_failover --command=start --ssh_user=mysql --orig_master_host=192.168.4.42 --orig_master_ip=192.168.4.42 --orig_master_port=3306 --new_master_host=192.168.4.43 --new_master_ip=192.168.4.43 --new_master_port=3306 --new_master_user='mha_mgr' --new_master_password='P\[AI3M\~5z'
DBI connect(';host=192.168.4.43;port=3306;mysql_connect_timeout=4','mha_mgr',...) failed: Access denied for user 'mha_mgr'@'192.168.4.45' (using password: YES) at /usr/local/share/perl5/MHA/DBHelper.pm line 205.
at /usr/local/bin/master_ip_failover line 81.
Fri Jan 12 10:33:55 2018 - [error][/usr/local/share/perl5/MHA/MasterFailover.pm, ln1588] Failed to activate master IP address for 192.168.4.43(192.168.4.43:3306) with return code 10:0
Fri Jan 12 10:33:55 2018 - [warning] Proceeding.
Fri Jan 12 10:33:55 2018 - [info] Setting read_only=0 on 192.168.4.43(192.168.4.43:3306)..
Fri Jan 12 10:33:55 2018 - [info] ok.
...
Master failover to 192.168.4.43(192.168.4.43:3306) completed successfully.
Fri Jan 12 10:33:57 2018 - [info] Sending mail..
解决方法, 换个比较正常的密码就行了. 至于修复这个小缺陷, 还要仔细查下代码.
免责声明:
① 本站未注明“稿件来源”的信息均来自网络整理。其文字、图片和音视频稿件的所属权归原作者所有。本站收集整理出于非商业性的教育和科研之目的,并不意味着本站赞同其观点或证实其内容的真实性。仅作为临时的测试数据,供内部测试之用。本站并未授权任何人以任何方式主动获取本站任何信息。
② 本站未注明“稿件来源”的临时测试数据将在测试完成后最终做删除处理。有问题或投稿请发送至: 邮箱/279061341@qq.com QQ/279061341