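How to remove a failed etcd member and add a new one
etcd had been running for some time when we recently got an etcd alert. One etcd container turned out to be damaged, but etcd had not removed that member on its own. Around the same time, a node was added to k8s, but it had not joined the cluster and had to be added again.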
[root@A01-R04-I69-122 ~]# etcdctl --endpoints="http://etcd-xxxl:4001" cluster-health
failed to check the health of member ec292d985b723e4 on http://10.187.27.196:4001: Get http://10.187.27.196:4001/health: dial tcp 10.187.27.196:4001: getsockopt: connection timed out
member ec292d985b723e4 is unreachable: [http://10.187.27.196:4001] are all unreachable
member 2727dc2f519c6794 is healthy: got healthy result from http://10.185.243.35:4001
member 4e56d8229082190f is healthy: got healthy result from http://10.187.24.132:4001
member 5780625be722ce57 is healthy: got healthy result from http://10.187.24.134:4001
member ec2aa2bbe0c891b7 is healthy: got healthy result from http://10.187.27.200:4001
The member at 10.187.27.196 is the damaged one. Note its member ID: ec292d985b723e4
Run the remove command:
etcdctl --endpoints="http://etcd-xxxl:4001" member remove ec292d985b723e4
Removed member ec292d985b723e4 from cluster
Check again; the damaged member is gone.
etcdctl --endpoints="http://etcd-xxxxxxx:4001" cluster-health
member 2727dc2f519c6794 is healthy: got healthy result from http://10.185.243.35:4001
member 4e56d8229082190f is healthy: got healthy result from http://10.187.24.132:4001
member 5780625be722ce57 is healthy: got healthy result from http://10.187.24.134:4001
member ec2aa2bbe0c891b7 is healthy: got healthy result from http://10.187.27.200:4001
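The "find the unreachable member ID" step above can be sketched as a small filter. This is only an illustration: it assumes the cluster-health output format shown in this post, and the function name unreachable_members and the ENDPOINTS variable are made up for the example. It only prints IDs; the removal itself is left as a manual step so the IDs can be reviewed first.

```shell
#!/bin/sh
# Read `etcdctl cluster-health` output on stdin and print the ID of every
# member reported as "unreachable". Assumes lines shaped like:
#   member <id> is unreachable: [...] are all unreachable
unreachable_members() {
    awk '$1 == "member" && $3 == "is" && $4 ~ /^unreachable/ { print $2 }'
}

# Sample input copied from the output earlier in this post.
sample='failed to check the health of member ec292d985b723e4 on http://10.187.27.196:4001: Get http://10.187.27.196:4001/health: dial tcp 10.187.27.196:4001: getsockopt: connection timed out
member ec292d985b723e4 is unreachable: [http://10.187.27.196:4001] are all unreachable
member 2727dc2f519c6794 is healthy: got healthy result from http://10.185.243.35:4001'

printf '%s\n' "$sample" | unreachable_members   # prints ec292d985b723e4

# Against a live cluster (ENDPOINTS is a placeholder for your endpoint list):
#   etcdctl --endpoints="$ENDPOINTS" cluster-health | unreachable_members
# then, after checking the ID by hand:
#   etcdctl --endpoints="$ENDPOINTS" member remove <id>
```

Keep in mind that member removal goes through the cluster's consensus, so it only succeeds while a majority of members is still healthy, as was the case here (4 of 5).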
For the new node, first stop etcd, delete its data directory, and then restart it. After the restart, it joins the cluster.
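The stop, delete, restart sequence above could look roughly like the sketch below. How etcd is stopped and started depends on the deployment, so those steps are left as comments. The helper name wipe_data_dir and the ETCD_DATA_DIR variable are hypothetical; the post does not say where the data directory lives.

```shell
#!/bin/sh
# Delete an etcd data directory so the member comes back with empty state.
# Refuses an empty path or "/" to avoid an accidental wipe.
wipe_data_dir() {
    dir=$1
    case "$dir" in
        ""|"/") echo "refusing to remove '$dir'" >&2; return 1 ;;
    esac
    if [ ! -d "$dir" ]; then
        echo "no such directory: $dir" >&2
        return 1
    fi
    rm -rf -- "$dir"
}

# Intended order on the new node:
#   1. stop the etcd process or container (deployment specific)
#   2. wipe_data_dir "$ETCD_DATA_DIR"    # ETCD_DATA_DIR is a placeholder
#   3. start etcd again and confirm with `etcdctl cluster-health`
```

The guard against empty and root paths is there because an unset variable in step 2 would otherwise expand to an empty argument.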
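Note that we deploy etcd through k8s, and this procedure is not suitable for other deployment methods. We strongly recommend deploying etcd through k8s.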