Highly available configuration of the Kubernetes master nodes

Anyone familiar with the Kubernetes architecture knows the position the master node occupies in the cluster. To keep the whole architecture highly available, Kubernetes offers an HA setup. Out of interest, and to understand the architecture better, I tried it out on my own machines.

Environment:

CentOS 7.3, with the following Kubernetes version:

Client Version: version.Info{Major:"1", Minor:"5", GitVersion:"v1.5.1", GitCommit:"82450d03cb057bab0950214ef122b67c83fb11df", GitTreeState:"clean", BuildDate:"2016-12-14T00:57:05Z", GoVersion:"go1.7.4", Compiler:"gc", Platform:"linux/amd64"}
Server Version: version.Info{Major:"1", Minor:"5", GitVersion:"v1.5.1", GitCommit:"82450d03cb057bab0950214ef122b67c83fb11df", GitTreeState:"clean", BuildDate:"2016-12-14T00:52:01Z", GoVersion:"go1.7.4", Compiler:"gc", Platform:"linux/amd64"}

Host entries in /etc/hosts:

192.168.0.107    k8s-master1
192.168.0.108    k8s-master2
192.168.0.109    k8s-master3

1. Set up the etcd cluster

  • Disable SELinux and the firewall (a note on making the SELinux change persistent follows the commands)
setenforce 0
systemctl stop firewalld
systemctl disable firewalld
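Note that setenforce 0 only turns SELinux off until the next reboot. A small sketch of making the change persistent as well, assuming the stock CentOS 7 config file at /etc/selinux/config:

sed -i 's/^SELINUX=enforcing$/SELINUX=disabled/' /etc/selinux/config   # takes effect on the next boot
getenforce                                                             # prints Permissive after the setenforce 0 above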
  • Install the required packages
yum -y install ntpdate gcc git vim wget
  • Configure periodic time synchronization via cron (a non-interactive way to install the entry is sketched below)
*/5 * * * * /usr/sbin/ntpdate time.windows.com >/dev/null 2>&1
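One way to install that crontab entry without opening an editor (a sketch; run it as root so /usr/sbin/ntpdate is permitted):

(crontab -l 2>/dev/null; echo "*/5 * * * * /usr/sbin/ntpdate time.windows.com >/dev/null 2>&1") | crontab -
crontab -l    # confirm the entry is present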
  • Download the etcd release and install the binaries (a quick sanity check follows the commands)
cd /usr/src
wget https://github.com/coreos/etcd/releases/download/v3.0.15/etcd-v3.0.15-linux-amd64.tar.gz
tar -xvf etcd-v3.0.15-linux-amd64.tar.gz
cp etcd-v3.0.15-linux-amd64/etcd* /usr/local/bin
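A quick sanity check that the binaries are on the PATH and are the expected release:

etcd --version       # should report 3.0.15
etcdctl --version    # the client shipped in the same tarball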
  • Write a deploy-etcd.sh script and run it on every node (a distribution sketch follows the script)
#!/bin/bash

# Copyright 2014 The Kubernetes Authors.
#
# Licensed under the Apache License, Version 2.0 (the "License");
# you may not use this file except in compliance with the License.
# You may obtain a copy of the License at
#
#     http://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an "AS IS" BASIS,
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.

## Create etcd.conf, etcd.service, and start etcd service.

ETCD_NAME=`hostname`
ETCD_DATA_DIR=/var/lib/etcd
ETCD_CONF_DIR=/etc/etcd
ETCD_CLUSTER='k8s-master1=http://192.168.0.107:2380,k8s-master2=http://192.168.0.108:2380,k8s-master3=http://192.168.0.109:2380'
ETCD_LISTEN_IP=`ip addr show enp0s3 |grep -w 'inet' |awk -F " " '{print $2}' |awk -F "/" '{print $1}'`

# etcd runs as the "etcd" user (see the systemd unit below); create it if it does not exist yet.
id etcd >/dev/null 2>&1 || useradd -r -s /sbin/nologin etcd
mkdir -p $ETCD_DATA_DIR $ETCD_CONF_DIR
chown -R etcd:etcd $ETCD_DATA_DIR

cat <<EOF >/etc/etcd/etcd.conf
# [member]
ETCD_NAME=${ETCD_NAME}
ETCD_DATA_DIR="/var/lib/etcd/default.etcd"
#ETCD_WAL_DIR=""
ETCD_SNAPSHOT_COUNT="10000"
ETCD_HEARTBEAT_INTERVAL="100"
ETCD_ELECTION_TIMEOUT="1000"
ETCD_LISTEN_PEER_URLS="http://${ETCD_LISTEN_IP}:2380"
ETCD_LISTEN_CLIENT_URLS="http://${ETCD_LISTEN_IP}:2379"
ETCD_MAX_SNAPSHOTS="5"
ETCD_MAX_WALS="5"
#ETCD_CORS=""
#
#[cluster]
ETCD_INITIAL_ADVERTISE_PEER_URLS="http://${ETCD_LISTEN_IP}:2380"
# if you use different ETCD_NAME (e.g. test), set ETCD_INITIAL_CLUSTER value for this name, i.e. "test=http://..."
ETCD_INITIAL_CLUSTER="${ETCD_CLUSTER}"
ETCD_INITIAL_CLUSTER_STATE="new"
ETCD_INITIAL_CLUSTER_TOKEN="etcd-cluster"
ETCD_ADVERTISE_CLIENT_URLS="http://${ETCD_LISTEN_IP}:2379"
#ETCD_DISCOVERY=""
#ETCD_DISCOVERY_SRV=""
#ETCD_DISCOVERY_FALLBACK="proxy"
#ETCD_DISCOVERY_PROXY=""
#ETCD_STRICT_RECONFIG_CHECK="false"
#ETCD_AUTO_COMPACTION_RETENTION="0"
#
#[proxy]
#ETCD_PROXY="off"
#ETCD_PROXY_FAILURE_WAIT="5000"
#ETCD_PROXY_REFRESH_INTERVAL="30000"
#ETCD_PROXY_DIAL_TIMEOUT="1000"
#ETCD_PROXY_WRITE_TIMEOUT="5000"
#ETCD_PROXY_READ_TIMEOUT="0"
#
#[security]
#ETCD_CERT_FILE=""
#ETCD_KEY_FILE=""
#ETCD_CLIENT_CERT_AUTH="false"
#ETCD_TRUSTED_CA_FILE=""
#ETCD_AUTO_TLS="false"
#ETCD_PEER_CERT_FILE=""
#ETCD_PEER_KEY_FILE=""
#ETCD_PEER_CLIENT_CERT_AUTH="false"
#ETCD_PEER_TRUSTED_CA_FILE=""
#ETCD_PEER_AUTO_TLS="false"
#
#[logging]
#ETCD_DEBUG="false"
# examples for -log-package-levels etcdserver=WARNING,security=DEBUG
#ETCD_LOG_PACKAGE_LEVELS=""
EOF

cat <<EOF >/usr/lib/systemd/system/etcd.service
[Unit]
Description=Etcd Server
After=network.target
After=network-online.target
Wants=network-online.target

[Service]
Type=notify
WorkingDirectory=/var/lib/etcd/
EnvironmentFile=-/etc/etcd/etcd.conf
User=etcd
# set GOMAXPROCS to number of processors
#ExecStart=/bin/bash -c "GOMAXPROCS=$(nproc) /usr/local/bin/etcd --name=\"${ETCD_NAME}\" --data-dir=\"${ETCD_DATA_DIR}\" --listen-client-urls=\"${ETCD_LISTEN_CLIENT_URLS}\""
ExecStart=/bin/bash -c "GOMAXPROCS=$(nproc) /usr/local/bin/etcd"
Restart=on-failure
LimitNOFILE=65536

[Install]
WantedBy=multi-user.target
EOF
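The script derives the member name from hostname and the listen address from the enp0s3 interface, so it has to run locally on each of the three machines. A distribution sketch, assuming password-less root SSH between the hosts and the same NIC name everywhere:

for host in k8s-master1 k8s-master2 k8s-master3; do
    scp /usr/src/deploy-etcd.sh root@${host}:/usr/src/
    ssh root@${host} "bash /usr/src/deploy-etcd.sh"
done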
  • Run the following commands on every node
systemctl daemon-reload
systemctl enable etcd
systemctl restart etcd

etcdctl cluster-health
  • The following error appears:
[root@k8s-master1 ~]# etcdctl cluster-health
cluster may be unhealthy: failed to list members
Error:  client: etcd cluster is unavailable or misconfigured
error #0: dial tcp 127.0.0.1:2379: getsockopt: connection refused
error #1: dial tcp 127.0.0.1:4001: getsockopt: connection refused

The reason is that etcdctl always tries the local address by default. Specify the endpoints explicitly and the output looks like this:

[root@k8s-master1 ~]# etcdctl -endpoints "http://192.168.0.107:2379,http://192.168.0.108:2379,http://192.168.0.109:2379"  cluster-health
member 1578ba76eb3abe05 is healthy: got healthy result from http://192.168.0.108:2379
member beb7fd3596aa26eb is healthy: got healthy result from http://192.168.0.109:2379
member e6bdc10e37172e00 is healthy: got healthy result from http://192.168.0.107:2379
cluster is healthy
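To avoid typing the endpoint list on every call, one option is to wrap it in a shell alias (a sketch using the same flag as above):

alias etcdctl='etcdctl -endpoints "http://192.168.0.107:2379,http://192.168.0.108:2379,http://192.168.0.109:2379"'
etcdctl cluster-health    # now works without extra arguments
etcdctl member list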

2. Set up the Kubernetes high-availability environment

  • By default the master and etcd are deployed on the same machine, three machines in total for mutual redundancy
  • The offline installation media can be downloaded directly from https://pan.baidu.com/s/1i5jusip
  • Start an HTTP server with HFS; the nodes being installed pull images and RPMs from it

First download HFS. Since my host machine runs Windows 7, I downloaded the Windows version. After starting it, drag the downloaded directories and files into the HFS window, as shown in the figure.

Turn off the Windows firewall.
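HFS is just a convenient static file server, so any HTTP server will do. If you prefer to serve the offline media from a Linux host instead, a minimal sketch (the directory path is a placeholder for wherever the media was unpacked):

cd /path/to/offline-media
python -m SimpleHTTPServer 8000     # Python 2, stock on CentOS 7
# python3 -m http.server 8000       # equivalent with Python 3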

Modify the k8s-deploy.sh script; the changed parts are as follows:

HTTP_SERVER=192.168.0.103:8000
.
.
.
# The master side does not need to be changed to a fixed IP; keeping the original $(master_ip) works. The replica side, however, has to be changed; the exact reason still needs to be investigated.
kube::copy_master_config()
{
    local master_ip=$(etcdctl get ha_master)
    mkdir -p /etc/kubernetes
    scp -r [email protected]192.168.0.107:/etc/kubernetes/* /etc/kubernetes/
    systemctl start kubelet
}
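The scp in copy_master_config() pulls the certificates and kubeconfig files from the first master, so each replica needs password-less root SSH to 192.168.0.107. A sketch of setting that up from a replica, assuming root login over SSH is permitted:

ssh-keygen -t rsa -N "" -f ~/.ssh/id_rsa        # skip if a key already exists
ssh-copy-id root@192.168.0.107
ssh root@192.168.0.107 "ls /etc/kubernetes"     # quick connectivity test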
  • Master node
curl -L http://192.168.0.101:8000/k8s-deploy.sh | bash -s master \
    --api-advertise-addresses=192.168.0.110     --external-etcd-endpoints=http://192.168.0.107:2379,http://192.168.0.108:2379,http://192.168.0.109:2379
  • 192.168.0.101:8000 is my http-server; remember to change the HTTP_SERVER variable in k8s-deploy.sh to match
  • --api-advertise-addresses is the VIP address
  • --external-etcd-endpoints is your etcd cluster's address list; with it set, kubeadm no longer generates the etcd.yaml manifest (a sketch of the equivalent kubeadm init call follows this list)
  • Record the token from the output; it is needed on the minion side
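For reference, the master branch of the script essentially boils down to a kubeadm init call with the same two flags; this is only a hedged approximation, since the actual script also does more around it (pulling images from the HTTP server, setting up the VIP, and so on):

kubeadm init \
    --api-advertise-addresses=192.168.0.110 \
    --external-etcd-endpoints=http://192.168.0.107:2379,http://192.168.0.108:2379,http://192.168.0.109:2379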

Output after the run:

[init] Using Kubernetes version: v1.5.1
[tokens] Generated token: "e5029f.020306948a9c120f"
[certificates] Generated Certificate Authority key and certificate.
[certificates] Generated API Server key and certificate
[certificates] Generated Service Account signing keys
[certificates] Created keys and certificates in "/etc/kubernetes/pki"
[kubeconfig] Wrote KubeConfig file to disk: "/etc/kubernetes/kubelet.conf"
[kubeconfig] Wrote KubeConfig file to disk: "/etc/kubernetes/admin.conf"
[apiclient] Created API client, waiting for the control plane to become ready
[apiclient] All control plane components are healthy after 23.199910 seconds
[apiclient] Waiting for at least one node to register and become ready
[apiclient] First node is ready after 0.512201 seconds
[apiclient] Creating a test deployment
[apiclient] Test deployment succeeded
[token-discovery] Created the kube-discovery deployment, waiting for it to become ready
[token-discovery] kube-discovery is ready after 2.004430 seconds
[addons] Created essential addon: kube-proxy
[addons] Created essential addon: kube-dns

Your Kubernetes master has initialized successfully!

You should now deploy a pod network to the cluster.
Run "kubectl apply -f [podnetwork].yaml" with one of the options listed at:
    http://kubernetes.io/docs/admin/addons/

You can now join any number of machines by running the following on each node:

kubeadm join --token=e5029f.020306948a9c120f 192.168.0.110
+ echo -e '\033[32m 赶紧找地方记录上面的token! \033[0m'
 赶紧找地方记录上面的token!
+ kubectl apply -f http://192.168.0.101:8000/network/kube-flannel.yaml --namespace=kube-system
serviceaccount "flannel" created
configmap "kube-flannel-cfg" created
daemonset "kube-flannel-ds" created
+ kubectl get po --all-namespaces
NAMESPACE     NAME                              READY     STATUS              RESTARTS   AGE
kube-system   dummy-2088944543-fjhbc            1/1       Running             0          7s
kube-system   kube-discovery-1769846148-ks84b   1/1       Running             0          6s
kube-system   kube-dns-2924299975-zg6b8         0/4       ContainerCreating   0          3s
kube-system   kube-flannel-ds-jzq98             0/2       Pending             0          1s
kube-system   kube-proxy-c0mx7                  0/1       ContainerCreating   0          3s
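Once the pods settle, the control plane can also be checked with the admin kubeconfig that kubeadm wrote (a sketch; componentstatuses still exists in this 1.5 release):

kubectl --kubeconfig=/etc/kubernetes/admin.conf get componentstatuses   # scheduler, controller-manager and the etcd members should be Healthy
kubectl --kubeconfig=/etc/kubernetes/admin.conf get nodes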
  • Replica master node
curl -L http://192.168.0.103:8000/k8s-deploy.sh | bash -s replica \
    --api-advertise-addresses=192.168.0.110     --external-etcd-endpoints=http://192.168.0.107:2379,http://192.168.0.108:2379,http://192.168.0.109:2379

Output:

++ hostname
+ grep k8s-master2
k8s-master2   Ready          30s
++ hostname
+ kubectl label node k8s-master2 kubeadm.alpha.kubernetes.io/role=master
node "k8s-master2" labeled

After the three-node HA cluster is built, run a couple of commands to check its state:

[root@k8s-master2 ~]# kubectl get nodes
NAME          STATUS         AGE
k8s-master1   Ready,master   11h
k8s-master2   Ready,master   5m
k8s-master3   Ready,master   9h
[root@k8s-master2 ~]# kubectl get pods --all-namespaces
NAMESPACE     NAME                                  READY     STATUS    RESTARTS   AGE
kube-system   dummy-2088944543-fjhbc                1/1       Running   1          11h
kube-system   kube-apiserver-k8s-master1            1/1       Running   4          11h
kube-system   kube-apiserver-k8s-master2            1/1       Running   0          5m
kube-system   kube-apiserver-k8s-master3            1/1       Running   1          9h
kube-system   kube-controller-manager-k8s-master1   1/1       Running   4          11h
kube-system   kube-controller-manager-k8s-master2   1/1       Running   0          5m
kube-system   kube-controller-manager-k8s-master3   1/1       Running   1          9h
kube-system   kube-discovery-1769846148-ks84b       1/1       Running   1          11h
kube-system   kube-dns-2924299975-zg6b8             4/4       Running   10         11h
kube-system   kube-flannel-ds-37zsp                 2/2       Running   3          9h
kube-system   kube-flannel-ds-8kwnh                 2/2       Running   0          5m
kube-system   kube-flannel-ds-jzq98                 2/2       Running   4          11h
kube-system   kube-proxy-c0mx7                      1/1       Running   1          11h
kube-system   kube-proxy-r9nmw                      1/1       Running   1          9h
kube-system   kube-proxy-rbxf7                      1/1       Running   0          5m
kube-system   kube-scheduler-k8s-master1            1/1       Running   3          11h
kube-system   kube-scheduler-k8s-master2            1/1       Running   0          5m
kube-system   kube-scheduler-k8s-master3            1/1       Running   1          9h

Shut down master1 and verify the VIP; it keeps answering pings:

64 bytes from 192.168.0.110: icmp_seq=129 ttl=64 time=0.049 ms
64 bytes from 192.168.0.110: icmp_seq=130 ttl=64 time=0.050 ms
64 bytes from 192.168.0.110: icmp_seq=131 ttl=64 time=0.049 ms
64 bytes from 192.168.0.110: icmp_seq=132 ttl=64 time=0.049 ms
64 bytes from 192.168.0.110: icmp_seq=133 ttl=64 time=0.049 ms
64 bytes from 192.168.0.110: icmp_seq=134 ttl=64 time=0.099 ms
64 bytes from 192.168.0.110: icmp_seq=135 ttl=64 time=0.048 ms
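A ping only shows that the VIP moved; to confirm the API server itself is reachable through it, point kubectl at the VIP from a surviving master (a sketch, assuming the certificates were issued for the VIP, as --api-advertise-addresses suggests):

kubectl --kubeconfig=/etc/kubernetes/admin.conf --server=https://192.168.0.110:6443 get nodes
kubectl --kubeconfig=/etc/kubernetes/admin.conf --server=https://192.168.0.110:6443 get pods --all-namespaces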
  • Minion node
curl -L http://192.168.0.103:8000/k8s-deploy.sh |  bash -s join --token=e5029f.020306948a9c120f 192.168.0.110
  • The token is the one generated on the first master node
  • 192.168.0.110 is the floating VIP
  • Because of limited resources, the minion node deployment was not verified

Verification to be continued.
