Creating a Highly Available Cluster with kubeadm

  • Using stacked control plane nodes. This approach requires less infrastructure; the etcd members and the control plane nodes are co-located.
  • Using an external etcd cluster. This approach requires more infrastructure; the control plane nodes and the etcd members are separate.
    Before moving on, carefully consider which approach best meets the needs of your applications and environment. The HA topology options page describes the advantages and disadvantages of each approach.
  • How to install kubectl and kubeadm
  • How to install an external etcd cluster

Hosts used in this setup

IP            CPU    Memory (GB)    Disk (GB)    Role
10.1.6.48     8      16             100          control-plane1
10.1.6.24     8      16             100          control-plane2
10.1.6.45     8      16             100          control-plane3
10.1.6.46     8      16             100          work1
10.1.6.43     8      16             100          work2
10.1.6.47     8      16             100          work3
10.1.6.213    4      4              20           HA+KP1
10.1.6.214    4      4              20           HA+KP2
10.1.6.215    -      -              -            Load_Balancer_IP (VIP)
10.1.6.51     8      16             100          Etcd1
10.1.6.52     8      16             100          Etcd2
10.1.6.53     8      16             100          Etcd3

Create a load balancer for kube-apiserver

Keepalived provides a VRRP implementation and lets you configure Linux machines for load balancing while avoiding a single point of failure. HAProxy provides reliable, high-performance load balancing and works well together with Keepalived.

Since Keepalived and HAProxy are installed on both load balancer nodes (KP1 and KP2), if either node fails, the virtual IP address (the floating IP) is automatically re-associated with the other node, so the cluster keeps running and high availability is achieved. If needed, more nodes running Keepalived and HAProxy can be added for the same purpose.

First, run one of the following commands on both load balancer machines to install Keepalived and HAProxy (use yum or dnf depending on your distribution):

yum install keepalived haproxy psmisc -y
dnf install keepalived haproxy psmisc -y

Run the following on both load balancing machines to configure HAProxy (the configuration is identical on both machines):

vi /etc/haproxy/haproxy.cfg
global
log /dev/log local0 warning
chroot /var/lib/haproxy
pidfile /var/run/haproxy.pid
maxconn 4000
user haproxy
group haproxy
daemon

stats socket /var/lib/haproxy/stats

defaults
log global
option httplog
option dontlognull
timeout connect 5000
timeout client 50000
timeout server 50000

frontend kube-apiserver
bind *:8443
mode tcp
option tcplog
default_backend kube-apiserver

backend kube-apiserver
mode tcp
option tcplog
option tcp-check
balance roundrobin
default-server inter 10s downinter 5s rise 2 fall 2 slowstart 60s maxconn 250 maxqueue 256 weight 100
server kube-apiserver-1 10.1.6.48:6443 check # Replace the IP address with your own.
server kube-apiserver-2 10.1.6.24:6443 check # Replace the IP address with your own.
server kube-apiserver-3 10.1.6.45:6443 check # Replace the IP address with your own.
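
Before starting HAProxy, you can optionally validate the configuration file with HAProxy's standard check mode:

haproxy -c -f /etc/haproxy/haproxy.cfg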

Start HAProxy

Make sure LB2 has also been configured as above.

systemctl restart haproxy
systemctl enable haproxy
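
After starting it, a quick sanity check that HAProxy is running and listening on the frontend port 8443 configured above:

systemctl status haproxy
ss -lntp | grep 8443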

Configure Keepalived

The configuration on KP1 is as follows:

vim /etc/keepalived/keepalived.conf
global_defs {
notification_email {
}
router_id LVS_DEVEL
vrrp_skip_check_adv_addr
vrrp_garp_interval 0
vrrp_gna_interval 0
}

vrrp_script chk_haproxy {
script "killall -0 haproxy"
interval 2
weight 2
}

vrrp_instance haproxy-vip {
state BACKUP
priority 100
interface ens192 # Network interface name
virtual_router_id 60
advert_int 1
authentication {
auth_type PASS
auth_pass 1111
}
unicast_src_ip 10.1.6.213 # The IP address of this machine
unicast_peer {
10.1.6.214 # The IP address of peer machines
}

virtual_ipaddress {
10.1.6.215/24 # The VIP address
}

track_script {
chk_haproxy
}
}

The configuration on KP2 is as follows:

vim /etc/keepalived/keepalived.conf
global_defs {
notification_email {
}
router_id LVS_DEVEL
vrrp_skip_check_adv_addr
vrrp_garp_interval 0
vrrp_gna_interval 0
}

vrrp_script chk_haproxy {
script "killall -0 haproxy"
interval 2
weight 2
}

vrrp_instance haproxy-vip {
state BACKUP
priority 90
interface ens192 # Network card
virtual_router_id 60
advert_int 1
authentication {
auth_type PASS
auth_pass 1111
}
unicast_src_ip 10.1.6.214 # The IP address of this machine
unicast_peer {
10.1.6.213 # The IP address of the peer machine (KP1)
}

virtual_ipaddress {
10.1.6.215/24 # The VIP address
}

track_script {
chk_haproxy
}
}

Make sure both KP1 and KP2 have been configured as above, then start and enable Keepalived on both machines:

systemctl start keepalived
systemctl enable keepalived
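
To confirm that the VIP is working, check which node currently holds it; the interface name ens192 and the address 10.1.6.215 come from the Keepalived configuration above. As an optional failover test, stop Keepalived on KP1 and watch the VIP move to KP2:

# On KP1: the VIP should be bound to ens192
ip addr show ens192 | grep 10.1.6.215
# Optional failover test: stop Keepalived on KP1 ...
systemctl stop keepalived
# ... then run the same ip addr check on KP2; the VIP should now be there.
# Start Keepalived on KP1 again afterwards:
systemctl start keepalived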

Configure the hosts file on the master nodes

  • All master hosts need the following entries in /etc/hosts; otherwise api.k8s.verbos.com will not resolve later (a quick check follows the list below).

    10.1.6.48 containerd-master1
    10.1.6.24 containerd-master2
    10.1.6.45 containerd-master3
    10.1.6.46 containerd-work1
    10.1.6.43 containerd-work2
    10.1.6.47 containerd-work3
    10.1.6.51 etcd1
    10.1.6.52 etcd2
    10.1.6.53 etcd3
    10.1.6.215 api.k8s.verbos.com
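
On each master you can quickly confirm that the API server name now resolves to the VIP (getent consults /etc/hosts as well as DNS):

getent hosts api.k8s.verbos.com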

Set up the etcd cluster certificates

If you are using an external etcd cluster (etcd running on separate nodes), copy the etcd CA certificate and the apiserver etcd client certificate and key from one of the etcd nodes to the first master node:

export CONTROL_PLANE="root@10.1.6.48"
scp /etc/kubernetes/pki/etcd/ca.crt "${CONTROL_PLANE}":
scp /etc/kubernetes/pki/apiserver-etcd-client.crt "${CONTROL_PLANE}":
scp /etc/kubernetes/pki/apiserver-etcd-client.key "${CONTROL_PLANE}":
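
The scp commands above place the files in root's home directory on containerd-master1. A minimal sketch of moving them into the paths referenced by kubeadm-config.yaml below (assuming they landed in /root):

# Run on the first control plane node (10.1.6.48)
mkdir -p /etc/kubernetes/pki/etcd
mv /root/ca.crt /etc/kubernetes/pki/etcd/ca.crt
mv /root/apiserver-etcd-client.crt /etc/kubernetes/pki/
mv /root/apiserver-etcd-client.key /etc/kubernetes/pki/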

Set up the first control plane node

Create a file named kubeadm-config.yaml with the following content:

apiVersion: kubeadm.k8s.io/v1beta3
kind: ClusterConfiguration
kubernetesVersion: v1.22.10
controlPlaneEndpoint: "api.k8s.verbos.com:8443"
apiServer:
  certSANs:
  - 10.1.6.48
  - 10.1.6.24
  - 10.1.6.45
etcd:
  external:
    endpoints:
    - https://10.1.6.51:2379 # Change to your ETCD_0_IP
    - https://10.1.6.52:2379 # Change to your ETCD_1_IP
    - https://10.1.6.53:2379 # Change to your ETCD_2_IP
    caFile: /etc/kubernetes/pki/etcd/ca.crt
    certFile: /etc/kubernetes/pki/apiserver-etcd-client.crt
    keyFile: /etc/kubernetes/pki/apiserver-etcd-client.key
imageRepository: registry.aliyuncs.com/google_containers
networking:
  podSubnet: 10.244.0.0/16
  serviceSubnet: 10.10.0.0/16

Run the following command on this node:

kubeadm init --config kubeadm-config.yaml --upload-certs --v=5

Note: if the cluster initializes successfully, you will see output similar to the following.

  • Save the kubeadm join commands. The uploaded certificates expire after 2 hours by default; if they expire, simply regenerate them (see the sketch after the output below).
Your Kubernetes control-plane has initialized successfully!
To start using your cluster, you need to run the following as a regular user:

mkdir -p $HOME/.kube
sudo cp -i /etc/kubernetes/admin.conf $HOME/.kube/config
sudo chown $(id -u):$(id -g) $HOME/.kube/config

Alternatively, if you are the root user, you can run:

export KUBECONFIG=/etc/kubernetes/admin.conf

You should now deploy a pod network to the cluster.
Run "kubectl apply -f [podnetwork].yaml" with one of the options listed at:
https://kubernetes.io/docs/concepts/cluster-administration/addons/

You can now join any number of the control-plane node running the following command on each as root:

kubeadm join api.k8s.verbos.com:8443 --token 9oerz0.fw6s8ft9xa44i077 \
--discovery-token-ca-cert-hash sha256:be3c70562ae6bf8cfcfbbfa3bb8124fe63af3b1a0671e806a4ccf1bc243d5c6b \
--control-plane --certificate-key cdf0f280f9e59e18e5f60d98b624008d828ba00ef096a2e38fd9b6b1463be152

Please note that the certificate-key gives access to cluster sensitive data, keep it secret!
As a safeguard, uploaded-certs will be deleted in two hours; If necessary, you can use
"kubeadm init phase upload-certs --upload-certs" to reload certs afterward.

Then you can join any number of worker nodes by running the following on each as root:

kubeadm join api.k8s.verbos.com:8443 --token 9oerz0.fw6s8ft9xa44i077 \
--discovery-token-ca-cert-hash sha256:be3c70562ae6bf8cfcfbbfa3bb8124fe63af3b1a0671e806a4ccf1bc243d5c6b
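
If the join token or the uploaded certificates have expired, they can be regenerated on an existing control plane node with standard kubeadm commands:

# Create a fresh token and print a complete worker join command
kubeadm token create --print-join-command
# Re-upload the control plane certificates and print a new --certificate-key
kubeadm init phase upload-certs --upload-certs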

Copy the cluster configuration file

mkdir -p $HOME/.kube
sudo cp -i /etc/kubernetes/admin.conf $HOME/.kube/config
sudo chown $(id -u):$(id -g) $HOME/.kube/config
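
At this point you can check the cluster from the first master; the node will typically report NotReady until a CNI plugin is installed in the next step:

kubectl get nodes
kubectl get pods -n kube-system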

Install the CNI network plugin (Calico)

Note: at this point you need a CNI plugin to provide networking for the cluster. Without a CNI plugin the cluster will remain in an unusable state.

If you can access GitHub normally:

[root@containerd-kube-master .kube]# curl https://raw.githubusercontent.com/projectcalico/calico/v3.24.1/manifests/calico.yaml -O

For users in mainland China:

curl https://projectcalico.docs.tigera.io/manifests/calico.yaml -O

Modify the calico.yaml configuration

Change the CIDR to the cluster's Pod subnet, i.e. the podSubnet value from kubeadm-config.yaml (the address range that Pods run in), not the serviceSubnet:

- name: CALICO_IPV4POOL_CIDR
  value: "10.244.0.0/16"

Apply calico.yaml

kubectl apply -f calico.yaml
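
After applying the manifest, verify that the Calico pods come up and that the node becomes Ready:

kubectl get pods -n kube-system -l k8s-app=calico-node
kubectl get nodes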

Join the other control plane nodes to the cluster

Once containerd-master1 has finished initializing, you can join the other master nodes to the cluster:

kubeadm join api.k8s.verbos.com:8443 --token 9oerz0.fw6s8ft9xa44i077 \
--discovery-token-ca-cert-hash sha256:be3c70562ae6bf8cfcfbbfa3bb8124fe63af3b1a0671e806a4ccf1bc243d5c6b \
--control-plane --certificate-key cdf0f280f9e59e18e5f60d98b624008d828ba00ef096a2e38fd9b6b1463be152

After a node joins the cluster successfully, copy the cluster configuration file on it, otherwise kubectl will not work on that node:

mkdir -p $HOME/.kube
sudo cp -i /etc/kubernetes/admin.conf $HOME/.kube/config
sudo chown $(id -u):$(id -g) $HOME/.kube/config

Join the worker nodes to the cluster

kubeadm join api.k8s.verbos.com:8443 --token mck9bg.renw4t689mmx7oe1 \
--discovery-token-ca-cert-hash sha256:ed51fca8615acf5f437f63f66a9709e43ef942caf19aea4c595cb905e4a9a00f
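
Once all control plane and worker nodes have joined, you can confirm the full cluster membership from any master:

kubectl get nodes -o wide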

(Optional) Change the kubelet data directory

If you need to change the kubelet data directory, proceed as follows:

vim /etc/sysconfig/kubelet

Set the kubelet data directory (it is recommended to mount a separate data disk for kubelet):

KUBELET_EXTRA_ARGS="--root-dir=/data/k8s/kubelet"

Restart kubelet

systemctl daemon-reload
systemctl restart kubelet
# Unmount pod volumes still mounted under the old data directory
umount $(df -HT | grep '/var/lib/kubelet/pods' | awk '{print $7}')

Check that the new data directory now contains kubelet data:

[root@containerd-master2 kubelet]# ls /data/k8s/kubelet/
cpu_manager_state memory_manager_state plugins plugins_registry pod-resources pods

How to remove a master

  • First, a common question: when a master needs some kind of upgrade, how do we remove it from the cluster correctly and safely?
  1. First, mark the master as unschedulable and evict the Pods running on it:
    # Mark the master as unschedulable
    kubectl cordon online-beijing-master1
    # Drain (evict) the Pods running on the master
    kubectl drain online-beijing-master1 --ignore-daemonsets
    # Then delete the node from the cluster
    kubectl delete node online-beijing-master1

One small note: when you run kubectl drain, if the master has Pods with emptyDir volumes mounted, you need to add --delete-emptydir-data to remove them (a sketch follows below).
TODO: back up your data before deleting.
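
A minimal sketch of the drain command with that flag, plus resetting the removed master with the standard kubeadm reset command so that it can be re-joined later if needed:

# Drain, also deleting data in emptyDir volumes
kubectl drain online-beijing-master1 --ignore-daemonsets --delete-emptydir-data
# On the removed master itself: wipe the kubeadm-created state before any rejoin
kubeadm reset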