How to Upgrade Calico from 2.6.1 to 3.11
Overview
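Calico is a pure Layer 3 networking solution that provides multi-host communication for OpenStack VMs and Docker containers. Unlike flannel or the libnetwork overlay driver, Calico does not use an overlay network.
It takes a pure Layer 3 approach, using virtual routers instead of virtual switches. Each virtual router propagates reachability information (routes) to the rest of the data center over BGP.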
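The official upgrade documentation calls out the following points:
- 2.6.x and 3.x use different etcd APIs (this applies only to the etcd datastore): 2.6 uses etcdv2, while 3.x uses etcdv3.
- Upgrading from 2.6.x to 3.x requires at least 2.6.5.
Given the current environment, the cluster must first be upgraded to 2.6.5+ and then to 3.x.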
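The version gate described above can be sketched as a small shell check. This is only an illustration: the function names `version_ge` and `upgrade_path` are made up for this example, and the comparison assumes a `sort` that supports `-V` (GNU coreutils).

```shell
# Return 0 if version $1 is greater than or equal to version $2.
# Assumes GNU sort with -V (version sort).
version_ge() {
  [ "$(printf '%s\n%s\n' "$1" "$2" | sort -V | head -n1)" = "$2" ]
}

# Print the next step for a given Calico version (hypothetical helper).
upgrade_path() {
  local current="${1#v}"   # accept both "2.6.1" and "v2.6.1"
  if version_ge "$current" "3.0.0"; then
    echo "already on 3.x"
  elif version_ge "$current" "2.6.5"; then
    echo "migrate etcdv2 data, then upgrade to 3.x"
  else
    echo "upgrade to 2.6.5+ first, then migrate to 3.x"
  fi
}
```

For the cluster in this post, `upgrade_path 2.6.1` reports that an intermediate 2.6.5+ release is needed, which is why 2.6.12 is installed first.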
Upgrading 2.6.1 to 2.6.12
2019/12/25
In the existing environment, Calico stores its data in etcd through the etcdv2 API.
[root@k8s-1 kubelet]# which etcdv2
alias etcdv2='export ETCDCTL_API=2; /bin/etcdctl --ca-file /etc/etcd/ssl/etcd-root-ca.pem --cert-file /etc/etcd/ssl/etcd.pem --key-file /etc/etcd/ssl/etcd-key.pem --endpoints https://10.111.32.239:2379,https://10.111.32.241:2379,https://10.111.32.242:2379'
[root@k8s-1 kubelet]# etcdv2 ls /calico/ipam/v2/assignment/ipv4
/calico/ipam/v2/assignment/ipv4/block
[root@k8s-1 kubelet]# etcdv2 ls /calico/ipam/v2/assignment/ipv4/block
/calico/ipam/v2/assignment/ipv4/block/10.20.134.64-26
/calico/ipam/v2/assignment/ipv4/block/10.20.253.64-26
/calico/ipam/v2/assignment/ipv4/block/10.20.28.192-26
/calico/ipam/v2/assignment/ipv4/block/10.20.51.128-26
/calico/ipam/v2/assignment/ipv4/block/10.20.78.0-26
/calico/ipam/v2/assignment/ipv4/block/10.20.112.64-26
/calico/ipam/v2/assignment/ipv4/block/10.20.15.128-26
/calico/ipam/v2/assignment/ipv4/block/10.20.235.0-26
/calico/ipam/v2/assignment/ipv4/block/10.20.53.64-26
/calico/ipam/v2/assignment/ipv4/block/10.20.72.128-26
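According to the documentation, upgrading to 3.0 requires at least 2.6.5 plus some manual steps, because 3.x uses etcdv3 while 2.6.x uses etcdv2.
The cluster currently runs 2.6.1, so it is first upgraded to 2.6.5+.
The latest 2.6 release, 2.6.12, is used here.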
Download the calico.yaml file
[root@docker-182 v2.6]# wget https://docs.projectcalico.org/v2.6/getting-started/kubernetes/installation/rbac.yaml
[root@docker-182 v2.6]# wget https://docs.projectcalico.org/v2.6/getting-started/kubernetes/installation/hosted/calico.yaml
# Modify the configuration in calico.yaml
[root@docker-182 v2.6]# sh -x modify_calico_yaml.sh
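The post does not show the contents of modify_calico_yaml.sh. As a rough sketch of what such a script might do, the function below rewrites the `etcd_endpoints` value in the downloaded manifest. The function name `set_etcd_endpoints` is hypothetical, and the real script most likely also sets the TLS certificate paths and secrets, which are omitted here.

```shell
# Hypothetical helper: point the manifest's etcd_endpoints at the real cluster.
# $1 = path to the manifest, $2 = comma-separated etcd endpoints.
# A backup copy is kept next to the manifest with a .bak suffix.
set_etcd_endpoints() {
  local manifest="$1" endpoints="$2"
  # Keep the original indentation and key, replace only the value.
  sed -i.bak -E "s|^([[:space:]]*etcd_endpoints:).*|\1 \"${endpoints}\"|" "$manifest"
}
```

A call such as `set_etcd_endpoints calico.yaml "https://10.111.32.239:2379,https://10.111.32.241:2379,https://10.111.32.242:2379"` would match the endpoints used by the `etcdv2` alias in this environment.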
Pull the images in advance
[root@docker-182 v2.6]# grep image calico.yaml
          image: quay.io/calico/node:v2.6.12
          image: quay.io/calico/cni:v1.11.8
          image: quay.io/calico/kube-controllers:v1.0.5
          image: quay.io/calico/kube-controllers:v1.0.5
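The post lists the images but not how they were pulled. One possible way to turn the `grep image` output into a pre-pull step is sketched below. The function names are invented for this example, and the use of `docker pull` is an assumption about the container runtime on the nodes.

```shell
# Print the unique image references found in a manifest.
list_images() {
  awk '{ for (i = 1; i < NF; i++) if ($i == "image:") { print $(i + 1); break } }' "$1" | sort -u
}

# Pull every image referenced by the manifest (assumes Docker on this host).
pull_images() {
  local image
  list_images "$1" | while read -r image; do
    docker pull "$image"
  done
}
```

Running `pull_images calico.yaml` on each node before applying the manifest shortens the time the calico-node pods spend in ContainerCreating.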
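The documentation describes a staged upgrade, for example upgrading calico-kube-controllers first and then the calico-node DaemonSet. Here the new manifest is simply applied directly.
Note that this manifest does not include Calico's RBAC resources.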
[root@docker-182 v2.6]# k239 apply -f calico.yaml
configmap "calico-config" unchanged
secret "calico-etcd-secrets" unchanged
daemonset "calico-node" configured
deployment "calico-kube-controllers" configured
deployment "calico-policy-controller" configured
serviceaccount "calico-kube-controllers" unchanged
serviceaccount "calico-node" unchanged
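Applying the update
After the manifest was applied, the calico-node DaemonSet pods were not updated. Delete the pods so that they are recreated with the new version.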
[root@k8s-1 v2.6]# kubectl -n kube-system get pod -o wide |grep calico
calico-kube-controllers-6768b96c5f-rdbjp   1/1   Running             0   4m   10.111.32.243   k8s-4.geotmt.com
calico-node-45lnh                          0/1   ContainerCreating   0   4h   10.111.32.241   k8s-2.geotmt.com
calico-node-49mq7                          1/1   Running             1   5h   10.111.32.243   k8s-4.geotmt.com
calico-node-m86hr                          1/1   Running             0   5h   10.111.32.244   k8s-5.geotmt.com
calico-node-mm5fz                          0/1   ContainerCreating   0   4h   10.111.32.239   k8s-1.geotmt.com
calico-node-shrfw                          1/1   Running             0   4h   10.111.32.242   k8s-3.geotmt.com
calico-node-xx8hk                          1/1   Running             0   5h   10.111.32.245   k8s-6.geotmt.com
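Applying the manifest did not restart the calico-node pods. A likely reason, which is an assumption and not something the post confirms, is that the DaemonSet uses the OnDelete update strategy, where pods pick up a new template only after they are deleted. The sketch below deletes the pods one at a time. The function name and the `RESTART_PAUSE` variable are made up for this example, and the `k8s-app=calico-node` label is assumed to match the one in the applied manifest.

```shell
# Delete calico-node pods one by one so the DaemonSet recreates them
# from the updated template. A fixed pause between deletions is a crude
# substitute for waiting until the replacement pod is Ready.
restart_calico_nodes() {
  local pod
  for pod in $(kubectl -n kube-system get pods -l k8s-app=calico-node -o name); do
    kubectl -n kube-system delete "$pod"
    sleep "${RESTART_PAUSE:-30}"
  done
}
```

Deleting one pod at a time keeps most nodes on a working calico-node while the others restart.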
Testing after the update
As an example, one of the new calico-node pods is shown below. It now contains two containers.
[root@k8s-1 v2.6]# kubectl -n kube-system get pod -o wide |grep calico |grep k8s-6
calico-node-fj4t8   2/2   Running   0   25s   10.111.32.245   k8s-6.geotmt.com
Pinging a pod on another node works:
bash-4.4# ip a
1: lo: mtu 65536 qdisc noqueue state UNKNOWN qlen 1000
    link/loopback 00:00:00:00:00:00 brd 00:00:00:00:00:00
    inet 127.0.0.1/8 scope host lo
       valid_lft forever preferred_lft forever
2: tunl0@NONE: mtu 1480 qdisc noop state DOWN qlen 1000
    link/ipip 0.0.0.0 brd 0.0.0.0
4: eth0@if30: mtu 1500 qdisc noqueue state UP
    link/ether 6e:20:a3:45:42:49 brd ff:ff:ff:ff:ff:ff
    inet 10.20.235.12/32 scope global eth0
       valid_lft forever preferred_lft forever
bash-4.4# ping 10.20.15.135
PING 10.20.15.135 (10.20.15.135): 56 data bytes
64 bytes from 10.20.15.135: seq=0 ttl=62 time=1.133 ms
64 bytes from 10.20.15.135: seq=1 ttl=62 time=0.631 ms
With this version, a toleration still has to be added manually so that the pods can be scheduled on master nodes.
The upgrade to 2.6.12 is complete.
Upgrading 2.6.12 to 3.0
- Notes before upgrading
  - You must first upgrade to Calico v2.6.5 (or a later v2.6.x release) before you can upgrade to Calico v3.0.12. (Important: Calico v2.6.5 was a special transitional release that included changes to enable upgrade to v3.0.1+; do not skip this step!)
  - If you are using the etcd datastore, you should upgrade etcd to the latest stable v3 release.
Both conditions are met.
[root@k8s-1 net.d]# etcdctl version
etcdctl version: 3.3.11
API version: 3.3
- etcd datastore upgrade steps
  - Install and configure calico-upgrade
  - Test the data migration and check for errors
  - Migrate Calico data
  - Upgrade Calico
Install and configure calico-upgrade
[root@docker-182 ansible]# wget https://github.com/projectcalico/calico-upgrade/releases/download/v1.0.5/calico-upgrade
[root@docker-182 k8s_239]# ansible-playbook install_calico-upgrade.yml
Run a dry-run test
[root@k8s-1 calico-upgrade]# calico-upgrade dry-run --output-dir=tmp --apiconfigv1 /etc/calico/apiconfigv1.cfg --apiconfigv3 /etc/calico/apiconfigv3.cfg
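The dry-run command reads two connection files, /etc/calico/apiconfigv1.cfg and /etc/calico/apiconfigv3.cfg, whose contents the post does not show (they are presumably installed by the Ansible playbook). The sketch below writes a pair of files in the format I understand the Calico v3.0 documentation to describe. Treat the field names as an assumption and check them against that documentation. The helper name `write_apiconfigs` is hypothetical, and the certificate paths are the ones from the `etcdv2` alias earlier in this post.

```shell
# Hypothetical helper: write the v1 (etcdv2) and v3 (etcdv3) connection files.
# $1 = target directory, $2 = comma-separated etcd endpoints.
write_apiconfigs() {
  local dir="$1" endpoints="$2"

  # Connection settings for the old data, read through the etcdv2 API.
  cat > "$dir/apiconfigv1.cfg" <<EOF
apiVersion: v1
kind: calicoApiConfig
metadata:
spec:
  datastoreType: "etcdv2"
  etcdEndpoints: "$endpoints"
  etcdKeyFile: "/etc/etcd/ssl/etcd-key.pem"
  etcdCertFile: "/etc/etcd/ssl/etcd.pem"
  etcdCACertFile: "/etc/etcd/ssl/etcd-root-ca.pem"
EOF

  # Connection settings for the migrated data, written through the etcdv3 API.
  cat > "$dir/apiconfigv3.cfg" <<EOF
apiVersion: projectcalico.org/v3
kind: CalicoAPIConfig
metadata:
spec:
  datastoreType: "etcdv3"
  etcdEndpoints: "$endpoints"
  etcdKeyFile: "/etc/etcd/ssl/etcd-key.pem"
  etcdCertFile: "/etc/etcd/ssl/etcd.pem"
  etcdCACertFile: "/etc/etcd/ssl/etcd-root-ca.pem"
EOF
}
```

Both files point at the same etcd cluster here, because etcd 3.3 serves the v2 and v3 keyspaces side by side.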
Run the data migration
[root@k8s-1 calico-upgrade]# calico-upgrade start --ignore-v3-data --apiconfigv1 /etc/calico/apiconfigv1.cfg --apiconfigv3 /etc/calico/apiconfigv3.cfg
Preparing reports directory
 * creating report directory if it does not exist
 * validating permissions and removing old reports
Checking Calico version is suitable for migration
 * determined Calico version of: v2.6.12
 * the v1 API data can be migrated to the v3 API
Validating conversion of v1 data to v3
 * handling FelixConfiguration (global) resource
 * handling ClusterInformation (global) resource
 * handling FelixConfiguration (per-node) resources
 * handling BGPConfiguration (global) resource
 * handling Node resources
 * handling BGPPeer (global) resources
 * handling BGPPeer (node) resources
 * handling HostEndpoint resources
 * handling IPPool resources
 * handling GlobalNetworkPolicy resources
 * handling Profile resources
 * handling WorkloadEndpoint resources
 * data conversion successful
Data conversion validated successfully
Validating the v3 datastore
 * the v3 datastore is not empty
-------------------------------------------------------------------------------
Successfully validated v1 to v3 conversion.
You are about to start the migration of Calico v1 data format to Calico v3 data
format. During this time and until the upgrade is completed Calico networking
will be paused - which means no new Calico networked endpoints can be created.
No Calico configuration should be modified using calicoctl during this time.
Type "yes" to proceed (any other input cancels): yes
Pausing Calico networking
 * successfully paused Calico networking in the v1 configuration
Calico networking is now paused - waiting for 15s
Querying current v1 snapshot and converting to v3
 * handling FelixConfiguration (global) resource
 * handling ClusterInformation (global) resource
 * handling FelixConfiguration (per-node) resources
 * handling BGPConfiguration (global) resource
 * handling Node resources
 * handling BGPPeer (global) resources
 * handling BGPPeer (node) resources
 * handling HostEndpoint resources
 * handling IPPool resources
 * handling GlobalNetworkPolicy resources
 * handling Profile resources
 * handling WorkloadEndpoint resources
 * data converted successfully
Storing v3 data
 * Storing resources in v3 format
 * success: resources stored in v3 datastore
Migrating IPAM data
 * listing and converting IPAM allocation blocks
 * listing and converting IPAM affinity blocks
 * listing IPAM handles
 * storing IPAM data in v3 format
 * IPAM data migrated successfully
Data migration from v1 to v3 successful
 * check the output for details of the migrated resources
 * continue by upgrading your calico/node versions to Calico v3.x
-------------------------------------------------------------------------------
Successfully migrated Calico v1 data to v3 format.
Follow the detailed upgrade instructions available in the release documentation
to complete the upgrade. This includes:
 * upgrading your calico/node instances and orchestrator plugins (e.g. CNI) to the required v3.x release
 * running 'calico-upgrade complete' to complete the upgrade and resume Calico networking
See report(s) below for details of the migrated data.
Reports:
- name conversion: /root/calico-upgrade/calico-upgrade-report/convertednames
Download the v3.0 manifests
[root@docker-182 v3.0]# wget https://docs.projectcalico.org/v3.0/getting-started/kubernetes/installation/rbac.yaml
[root@docker-182 v3.0]# wget https://docs.projectcalico.org/v3.0/getting-started/kubernetes/installation/hosted/calico.yaml
For the changes introduced in 3.0, see the 3.0 release notes.
Pull the required images in advance
[root@docker-182 v3.0]# grep image calico.yaml
          image: quay.io/calico/node:v3.0.12
          image: quay.io/calico/cni:v3.0.12
          image: quay.io/calico/kube-controllers:v3.0.12
Apply the v3.0 manifest
[root@docker-182 v3.0]# k239 apply -f calico.yaml
configmap "calico-config" configured
secret "calico-etcd-secrets" unchanged
daemonset "calico-node" configured
deployment "calico-kube-controllers" configured
serviceaccount "calico-kube-controllers" unchanged
serviceaccount "calico-node" unchanged
This time the pods restart through a rolling update. Wait until every pod has been upgraded before continuing.
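One way to wait for the rolling restart to finish, rather than watching the pod list by hand, is `kubectl rollout status`. The wrapper below is a minimal sketch. The function name is invented, and it assumes the DaemonSet uses the RollingUpdate strategy, since `rollout status` does not report progress for other strategies.

```shell
# Block until the calico-node DaemonSet and the calico-kube-controllers
# Deployment have finished rolling out. Returns non-zero if either fails.
wait_for_calico_rollout() {
  kubectl -n kube-system rollout status daemonset/calico-node &&
    kubectl -n kube-system rollout status deployment/calico-kube-controllers
}
```

Running `wait_for_calico_rollout` between the manifest apply and `calico-upgrade complete` makes the ordering explicit in a script.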
Run calico-upgrade complete to finalize the upgrade
[root@k8s-1 calico-upgrade]# calico-upgrade complete --apiconfigv1 /etc/calico/apiconfigv1.cfg --apiconfigv3 /etc/calico/apiconfigv3.cfg
You are about to complete the upgrade process to Calico v3. At this point, the
v1 format data should have been successfully converted to v3 format, and all
calico/node instances and orchestrator plugins (e.g. CNI) should be running
Calico v3.x.
Type "yes" to proceed (any other input cancels): yes
Completing upgrade
Enabling Calico networking for v3
 * successfully resumed Calico networking in the v3 configuration (updated ClusterInformation)
Upgrade completed successfully
-------------------------------------------------------------------------------
Successfully completed the upgrade process.
If the command above is not run, Calico networking stays paused and the kubelet reports errors like the following:
E1225 19:56:04.837028 3281 kuberuntime_sandbox.go:54] CreatePodSandbox for pod "demo-deployment-6f4c6779b-b8zqq_default(1dd28cf0-270d-11ea-bd6c-c6a864ab864a)" failed: rpc error: code = Unknown desc = NetworkPlugin cni failed to set up pod "demo-deployment-6f4c6779b-b8zqq_default" network: Calico is currently not ready to process requests
E1225 19:56:04.837049 3281 kuberuntime_manager.go:647] createPodSandbox for pod "demo-deployment-6f4c6779b-b8zqq_default(1dd28cf0-270d-11ea-bd6c-c6a864ab864a)" failed: rpc error: code = Unknown desc = NetworkPlugin cni failed to set up pod "demo-deployment-6f4c6779b-b8zqq_default" network: Calico is currently not ready to process requests
E1225 19:56:04.837167 3281 pod_workers.go:186] Error syncing pod 1dd28cf0-270d-11ea-bd6c-c6a864ab864a ("demo-deployment-6f4c6779b-b8zqq_default(1dd28cf0-270d-11ea-bd6c-c6a864ab864a)"), skipping: failed to "CreatePodSandbox" for "demo-deployment-6f4c6779b-b8zqq_default(1dd28cf0-270d-11ea-bd6c-c6a864ab864a)" with CreatePodSandboxError: "CreatePodSandbox for pod \"demo-deployment-6f4c6779b-b8zqq_default(1dd28cf0-270d-11ea-bd6c-c6a864ab864a)\" failed: rpc error: code = Unknown desc = NetworkPlugin cni failed to set up pod \"demo-deployment-6f4c6779b-b8zqq_default\" network: Calico is currently not ready to process requests"
The upgrade to 3.0.12 succeeded.
Upgrading 3.0.12 to 3.11
According to the 3.11 guide "Upgrading Calico on Kubernetes", the upgrade only requires applying the new manifest (this environment does not use Application Layer Policy).
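This version of Calico fully supports the Kubernetes API datastore, so when downloading a manifest, make sure it matches the datastore your environment uses.
This environment uses the manifest for the etcd datastore.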
Download the manifest
[root@docker-182 v3.11]# wget https://docs.projectcalico.org/v3.11/manifests/calico-etcd.yaml
# Modify the etcd-related configuration in the manifest
[root@docker-182 v3.11]# bash -x modify_calico_yaml.sh
Pull the images in advance
[root@docker-182 v3.11]# grep image calico-etcd.yaml
          image: calico/cni:v3.11.1
          image: calico/pod2daemon-flexvol:v3.11.1
          image: calico/node:v3.11.1
          image: calico/kube-controllers:v3.11.1
Apply the new version
[root@docker-182 v3.11]# k239 apply -f calico-etcd.yaml
secret "calico-etcd-secrets" unchanged
configmap "calico-config" configured
clusterrole "calico-kube-controllers" configured
clusterrolebinding "calico-kube-controllers" configured
clusterrole "calico-node" configured
clusterrolebinding "calico-node" configured
daemonset "calico-node" configured
serviceaccount "calico-node" unchanged
deployment "calico-kube-controllers" configured
serviceaccount "calico-kube-controllers" unchanged
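Verify the new version
Check the pods of the new version: each pod now has only one container. In this version, install-cni and flexvol-driver (which did not exist in older versions) run as initContainers, so only one long-running container remains.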
[root@docker-182 ~]# k239 -n kube-system get pod -o wide |grep calico
calico-kube-controllers-85dc4fd46b-4wnmt   1/1   Running   0   1m    10.111.32.243   k8s-4.geotmt.com
calico-node-4bgkc                          1/1   Running   0   59s   10.111.32.241   k8s-2.geotmt.com
calico-node-5jg2t                          1/1   Running   0   31s   10.111.32.244   k8s-5.geotmt.com
calico-node-9fn6r                          1/1   Running   0   43s   10.111.32.245   k8s-6.geotmt.com
calico-node-9n7dn                          1/1   Running   0   1m    10.111.32.243   k8s-4.geotmt.com
calico-node-fxr46                          1/1   Running   0   1m    10.111.32.239   k8s-1.geotmt.com
calico-node-pgh6c                          1/1   Running   0   1m    10.111.32.242   k8s-3.geotmt.com
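The readiness check on a listing like the one above can also be automated. The sketch below reads `kubectl get pod` output on standard input and succeeds only if every Calico pod is Running with all of its containers ready. The function name `all_calico_ready` is made up for this example.

```shell
# Read "kubectl get pod" output on stdin. Exit 0 only if at least one
# Calico pod is listed and every one is Running with READY n/n.
all_calico_ready() {
  awk '
    /calico/ {
      n++
      split($2, ready, "/")                      # READY column, e.g. "1/1"
      if (ready[1] != ready[2] || $3 != "Running") bad++
    }
    END { if (n == 0 || bad > 0) exit 1 }
  '
}
```

For example, `kubectl -n kube-system get pod -o wide | all_calico_ready && echo "calico is ready"` prints the message only once no pod is left in a state such as ContainerCreating.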
Test cross-host pod communication
[root@k8s-1 ~]# kubectl exec -it demo-deployment-6f4c6779b-b8zqq /bin/bash
bash-4.4# ping 10.20.235.12
PING 10.20.235.12 (10.20.235.12): 56 data bytes
64 bytes from 10.20.235.12: seq=0 ttl=62 time=1.232 ms
^C
--- 10.20.235.12 ping statistics ---
1 packets transmitted, 1 packets received, 0% packet loss
round-trip min/avg/max = 1.232/1.232/1.232 ms
bash-4.4# ping 10.20.253.80
PING 10.20.253.80 (10.20.253.80): 56 data bytes
64 bytes from 10.20.253.80: seq=0 ttl=62 time=1.730 ms
64 bytes from 10.20.253.80: seq=1 ttl=62 time=1.385 ms
^C
--- 10.20.253.80 ping statistics ---
2 packets transmitted, 2 packets received, 0% packet loss
round-trip min/avg/max = 1.385/1.557/1.730 ms
bash-4.4# ip a
1: lo: mtu 65536 qdisc noqueue state UNKNOWN qlen 1000
    link/loopback 00:00:00:00:00:00 brd 00:00:00:00:00:00
    inet 127.0.0.1/8 scope host lo
       valid_lft forever preferred_lft forever
2: tunl0@NONE: mtu 1480 qdisc noop state DOWN qlen 1000
    link/ipip 0.0.0.0 brd 0.0.0.0
4: eth0@if51: mtu 1500 qdisc noqueue state UP
    link/ether fa:d1:55:42:ab:6c brd ff:ff:ff:ff:ff:ff
    inet 10.20.15.163/32 scope global eth0
       valid_lft forever preferred_lft forever
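The manual ping test above can be repeated non-interactively after each upgrade step. The following is a minimal sketch: the function name is hypothetical, and it assumes the source pod's image ships a `ping` that accepts `-c` and `-W`.

```shell
# Ping each target IP from inside the given pod and report the result.
# $1 = source pod name, remaining arguments = pod IPs on other nodes.
# Returns non-zero if any target is unreachable.
check_pod_connectivity() {
  local source_pod="$1" ip failed=0
  shift
  for ip in "$@"; do
    if kubectl exec "$source_pod" -- ping -c 2 -W 2 "$ip" >/dev/null 2>&1; then
      echo "OK   $ip"
    else
      echo "FAIL $ip"
      failed=1
    fi
  done
  return "$failed"
}
```

With the addresses from this post, the call would be `check_pod_connectivity demo-deployment-6f4c6779b-b8zqq 10.20.235.12 10.20.253.80`.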
Test address allocation for a recreated pod: success
[root@k8s-1 ~]# kubectl delete pod nginx-deployment-7b66d98974-2rh87
pod "nginx-deployment-7b66d98974-2rh87" deleted
[root@k8s-1 ~]# kubectl get pod nginx-deployment-7b66d98974-nd8h7 -o wide
NAME                                READY   STATUS    RESTARTS   AGE   IP             NODE
nginx-deployment-7b66d98974-nd8h7   1/1     Running   0          1m    10.20.253.86   k8s-4.geotmt.com
The upgrade from Calico 3.0.12 to 3.11.1 succeeded.