01.系统初始化和全局变量
集群机器
test1:192.168.0.91
test2:192.168.0.92
test3:192.168.0.93
主机名
设置永久主机名称,然后重新登录:
sudo hostnamectl set-hostname test1 # 将 test1 替换为当前主机名
设置的主机名保存在 /etc/hostname 文件中;
修改每台机器的 /etc/hosts 文件,添加主机名和 IP 的对应关系:
grep test /etc/hosts
192.168.0.91 test1 test1
192.168.0.92 test2 test2
192.168.0.93 test3 test3
添加 k8s 和 docker 账户
在每台机器上添加 k8s 账户,可以无密码 sudo:
sudo useradd -m k8s
sudo sh -c 'echo 123456 | passwd k8s --stdin' # 为 k8s 账户设置密码
sudo visudo
sudo grep '%wheel.*NOPASSWD: ALL' /etc/sudoers
%wheel ALL=(ALL) NOPASSWD: ALL
sudo gpasswd -a k8s wheel
在每台机器上添加 docker 账户,将 k8s 账户添加到 docker 组中,同时配置 dockerd 参数:
sudo useradd -m docker
sudo gpasswd -a k8s docker
sudo mkdir -p /etc/docker/
cat /etc/docker/daemon.json
{
  "registry-mirrors": ["https://hub-mirror.c.163.com", "https://docker.mirrors.ustc.edu.cn"],
  "max-concurrent-downloads": 20
}
无密码 ssh 登录其它节点
如果没有特殊指明,本文档的所有操作均在 test1 节点上执行,然后远程分发文件和执行命令。
设置 test1 可以无密码登录所有节点的 k8s 和 root 账户:
[k8s@test1 k8s]$ ssh-keygen -t rsa
[k8s@test1 k8s]$ ssh-copy-id root@test1
[k8s@test1 k8s]$ ssh-copy-id root@test2
[k8s@test1 k8s]$ ssh-copy-id root@test3
[k8s@test1 k8s]$ ssh-copy-id k8s@test1
[k8s@test1 k8s]$ ssh-copy-id k8s@test2
[k8s@test1 k8s]$ ssh-copy-id k8s@test3
将可执行文件路径 /opt/k8s/bin 添加到 PATH 变量中
在每台机器上添加环境变量:
sudo sh -c "echo 'PATH=/opt/k8s/bin:$PATH:$HOME/bin:$JAVA_HOME/bin' >>/root/.bashrc"
echo 'PATH=/opt/k8s/bin:$PATH:$HOME/bin:$JAVA_HOME/bin' >>~/.bashrc
安装依赖包
在每台机器上安装依赖包:
CentOS:
sudo yum install -y epel-release
sudo yum install -y conntrack ipvsadm ipset jq sysstat curl iptables libseccomp
ipvs 依赖 ipset;
关闭防火墙
在每台机器上关闭防火墙:
sudo systemctl stop firewalld
sudo systemctl disable firewalld
sudo iptables -F && sudo iptables -X && sudo iptables -F -t nat && sudo iptables -X -t nat
sudo iptables -P FORWARD ACCEPT
关闭 swap 分区
如果开启了 swap 分区,kubelet 会启动失败(可以通过将参数 --fail-swap-on 设置为 false 来忽略 swap on),故需要在每台机器上关闭 swap 分区:
sudo swapoff -a
为了防止开机自动挂载 swap 分区,可以注释 /etc/fstab 中相应的条目:
sudo sed -i '/ swap / s/^\(.*\)$/#\1/g' /etc/fstab
关闭 SELinux
关闭 SELinux,否则后续 K8S 挂载目录时可能报错 Permission denied:
sudo setenforce 0
grep SELINUX /etc/selinux/config
SELINUX=disabled
修改配置文件,永久生效;
关闭 dnsmasq
linux 系统开启了 dnsmasq 后(如 GUI 环境),将系统 DNS Server 设置为 127.0.0.1,这会导致 docker 容器无法解析域名,需要关闭它:
sudo service dnsmasq stop
sudo systemctl disable dnsmasq
设置系统参数
cat > kubernetes.conf <<EOF
net.bridge.bridge-nf-call-iptables=1
net.bridge.bridge-nf-call-ip6tables=1
net.ipv4.ip_forward=1
vm.swappiness=0
vm.overcommit_memory=1
vm.panic_on_oom=0
fs.inotify.max_user_watches=89100
EOF
sudo cp kubernetes.conf /etc/sysctl.d/kubernetes.conf
sudo sysctl -p /etc/sysctl.d/kubernetes.conf
sudo mount -t cgroup -o cpu,cpuacct none /sys/fs/cgroup/cpu,cpuacct
加载内核模块
sudo modprobe br_netfilter
sudo modprobe ip_vs
设置系统时区
sudo timedatectl set-timezone Asia/Shanghai
# 将当前的 UTC 时间写入硬件时钟
sudo timedatectl set-local-rtc 0
# 重启依赖于系统时间的服务
sudo systemctl restart rsyslog
sudo systemctl restart crond
创建目录
在每台机器上创建目录:
sudo mkdir -p /opt/k8s/bin
sudo chown -R k8s /opt/k8s
sudo mkdir -p /etc/kubernetes/cert
sudo chown -R k8s /etc/kubernetes
sudo mkdir -p /etc/etcd/cert
sudo chown -R k8s /etc/etcd/cert
sudo mkdir -p /var/lib/etcd && sudo chown -R k8s /var/lib/etcd
集群环境变量
后续的部署步骤将使用下面定义的全局环境变量,请根据自己的机器、网络情况修改:
#!/usr/bin/bash
# 生成 EncryptionConfig 所需的加密 key
ENCRYPTION_KEY=$(head -c 32 /dev/urandom | base64)
# 最好使用 当前未用的网段 来定义服务网段和 Pod 网段
# 服务网段,部署前路由不可达,部署后集群内路由可达(kube-proxy 和 ipvs 保证)
SERVICE_CIDR="10.254.0.0/16" #
Pod 网段,建议 /16 段地址,部署前路由不可达,部署后集群内路由可达(flanneld 保证) CLUSTER_CIDR="172.30.0.0/16" # 服务端口范围 (NodePort Range) export NODE_PORT_RANGE="8400-9000" # 集群各机器 IP 数组 export NODE_IPS=(192.168.0.91 192.168.0.92 192.168.0.93) # 集群各 IP 对应的 主机名数组 export NODE_NAMES=(test1 test2 test3) # kube-apiserver 的 VIP(HA 组件 keepalived 发布的 IP) export MASTER_VIP="192.168.0.235" # kube-apiserver VIP 地址(HA 组件 haproxy 监听 8443 端口) export KUBE_APISERVER="https://${MASTER_VIP}:8443" # HA 节点,VIP 所在的网络接口名称 export VIP_IF="eth0" # etcd 集群服务地址列表 export ETCD_ENDPOINTS="https://192.168.0.91:2379,https://192.168.0.92:2379,https://192.168.0.93:2379" # etcd 集群间通信的 IP 和端口 export ETCD_NODES="test1=https://192.168.0.91:2380,test2=https://192.168.0.92:2380,test3=https://192.168.0.93:2380" # flanneld 网络配置前缀 export FLANNEL_ETCD_PREFIX="/kubernetes/network" # kubernetes 服务 IP (一般是 SERVICE_CIDR 中第一个IP) export CLUSTER_KUBERNETES_SVC_IP="10.254.0.1" # 集群 DNS 服务 IP (从 SERVICE_CIDR 中预分配) export CLUSTER_DNS_SVC_IP="10.254.0.2" # 集群 DNS 域名 export CLUSTER_DNS_DOMAIN="cluster.local." # 将二进制目录 /opt/k8s/bin 加到 PATH 中 export PATH=/opt/k8s/bin:$PATH 打包后的变量定义见 environment.sh,后续部署时会提示导入该脚本; 分发集群环境变量定义脚本 把全局变量定义脚本拷贝到所有节点的 /opt/k8s/bin 目录: source environment.sh for node_ip in ${NODE_IPS[@]} do echo ">>> ${node_ip}" scp environment.sh [email protected]${node_ip}:/opt/k8s/bin/ ssh [email protected]${node_ip} "chmod +x /opt/k8s/bin/*" done 02.创建 CA 证书和秘钥 为确保安全,kubernetes 系统各组件需要使用 x509 证书对通信进行加密和认证。 CA (Certificate Authority) 是自签名的根证书,用来签名后续创建的其它证书。 本文档使用 CloudFlare 的 PKI 工具集 cfssl 创建所有证书。 安装 cfssl 工具集 sudo mkdir -p /opt/k8s/cert && sudo chown -R k8s /opt/k8s && cd /opt/k8s wget https://pkg.cfssl.org/R1.2/cfssl_linux-amd64 mv cfssl_linux-amd64 /opt/k8s/bin/cfssl wget https://pkg.cfssl.org/R1.2/cfssljson_linux-amd64 mv cfssljson_linux-amd64 /opt/k8s/bin/cfssljson wget https://pkg.cfssl.org/R1.2/cfssl-certinfo_linux-amd64 mv cfssl-certinfo_linux-amd64 /opt/k8s/bin/cfssl-certinfo chmod +x /opt/k8s/bin/* export PATH=/opt/k8s/bin:$PATH 创建根证书 (CA) CA 证书是集群所有节点共享的,只需要创建一个 CA 证书,后续创建的所有证书都由它签名。 创建配置文件 CA 配置文件用于配置根证书的使用场景 (profile) 和具体参数 (usage,过期时间、服务端认证、客户端认证、加密等),后续在签名其它证书时需要指定特定场景。 cat > ca-config.json <<EOF { "signing": { "default": { "expiry": "87600h" }, "profiles": { "kubernetes": { "usages": [ "signing", "key encipherment", "server auth", "client auth" ], "expiry": "87600h" } } } } EOF signing:表示该证书可用于签名其它证书,生成的 ca.pem 证书中 CA=TRUE; server auth:表示 client 可以用该该证书对 server 提供的证书进行验证; client auth:表示 server 可以用该该证书对 client 提供的证书进行验证; 创建证书签名请求文件 cat > ca-csr.json <<EOF { "CN": "kubernetes", "key": { "algo": "rsa", "size": 2048 }, "names": [ { "C": "CN", "ST": "BeiJing", "L": "BeiJing", "O": "k8s", "OU": "4Paradigm" } ] } EOF CN:Common Name,kube-apiserver 从证书中提取该字段作为请求的用户名 (User Name),浏览器使用该字段验证网站是否合法; O:Organization,kube-apiserver 从证书中提取该字段作为请求用户所属的组 (Group); kube-apiserver 将提取的 User、Group 作为 RBAC 授权的用户标识; 生成 CA 证书和私钥 cfssl gencert -initca ca-csr.json | cfssljson -bare ca ls ca* 分发证书文件 将生成的 CA 证书、秘钥文件、配置文件拷贝到所有节点的 /etc/kubernetes/cert 目录下: source /opt/k8s/bin/environment.sh # 导入 NODE_IPS 环境变量 for node_ip in ${NODE_IPS[@]} do echo ">>> ${node_ip}" ssh [email protected]${node_ip} "mkdir -p /etc/kubernetes/cert && chown -R k8s /etc/kubernetes" scp ca*.pem ca-config.json [email protected]${node_ip}:/etc/kubernetes/cert done k8s 账户需要有读写 /etc/kubernetes 目录及其子目录文件的权限; 03.部署 kubectl 命令行工具 kubectl 是 kubernetes 集群的命令行管理工具,本文档介绍安装和配置它的步骤。 kubectl 默认从 ~/.kube/config 文件读取 kube-apiserver 地址、证书、用户名等信息,如果没有配置,执行 kubectl 命令时会报如下错误: kubectl get pods The connection to the server 
localhost:8080 was refused - did you specify the right host or port? 下载和分发 kubectl 二进制文件 下载、解压: wget https://dl.k8s.io/v1.10.4/kubernetes-client-linux-amd64.tar.gz tar -xzvf kubernetes-client-linux-amd64.tar.gz 分发到所有使用 kubectl 的节点: source /opt/k8s/bin/environment.sh for node_ip in ${NODE_IPS[@]} do echo ">>> ${node_ip}" scp kubernetes/client/bin/kubectl [email protected]${node_ip}:/opt/k8s/bin/ ssh [email protected]${node_ip} "chmod +x /opt/k8s/bin/*" done 创建 admin 证书和私钥 kubectl 与 apiserver https 安全端口通信,apiserver 对提供的证书进行认证和授权。 kubectl 作为集群的管理工具,需要被授予最高权限。这里创建具有最高权限的 admin 证书。 创建证书签名请求: cat > admin-csr.json <<EOF { "CN": "admin", "hosts": [], "key": { "algo": "rsa", "size": 2048 }, "names": [ { "C": "CN", "ST": "BeiJing", "L": "BeiJing", "O": "system:masters", "OU": "4Paradigm" } ] } EOF O 为 system:masters,kube-apiserver 收到该证书后将请求的 Group 设置为 system:masters; 预定义的 ClusterRoleBinding cluster-admin 将 Group system:masters 与 Role cluster-admin 绑定,该 Role 授予所有 API的权限; 该证书只会被 kubectl 当做 client 证书使用,所以 hosts 字段为空; 生成证书和私钥: cfssl gencert -ca=/etc/kubernetes/cert/ca.pem -ca-key=/etc/kubernetes/cert/ca-key.pem -config=/etc/kubernetes/cert/ca-config.json -profile=kubernetes admin-csr.json | cfssljson -bare admin ls admin* 创建 kubeconfig 文件 kubeconfig 为 kubectl 的配置文件,包含访问 apiserver 的所有信息,如 apiserver 地址、CA 证书和自身使用的证书; source /opt/k8s/bin/environment.sh # 设置集群参数 kubectl config set-cluster kubernetes --certificate-authority=/etc/kubernetes/cert/ca.pem --embed-certs=true --server=${KUBE_APISERVER} --kubeconfig=kubectl.kubeconfig # 设置客户端认证参数 kubectl config set-credentials admin --client-certificate=admin.pem --client-key=admin-key.pem --embed-certs=true --kubeconfig=kubectl.kubeconfig # 设置上下文参数 kubectl config set-context kubernetes --cluster=kubernetes --user=admin --kubeconfig=kubectl.kubeconfig # 设置默认上下文 kubectl config use-context kubernetes --kubeconfig=kubectl.kubeconfig --certificate-authority:验证 kube-apiserver 证书的根证书; --client-certificate、--client-key:生成的 admin 证书和私钥,连接 kube-apiserver 时使用 --embed-certs=true:将 ca.pem 和 admin.pem 证书内容嵌入到生成的 kubectl.kubeconfig 文件中(不加时,写入的是证书文件路径); 分发 kubeconfig 文件 分发到所有使用 kubectl 命令的节点: source /opt/k8s/bin/environment.sh for node_ip in ${NODE_IPS[@]} do echo ">>> ${node_ip}" ssh [email protected]${node_ip} "mkdir -p ~/.kube" scp kubectl.kubeconfig [email protected]${node_ip}:~/.kube/config ssh [email protected]${node_ip} "mkdir -p ~/.kube" scp kubectl.kubeconfig [email protected]${node_ip}:~/.kube/config done 保存到用户的 ~/.kube/config 文件; 04.部署 etcd 集群 etcd 是基于 Raft 的分布式 key-value 存储系统,由 CoreOS 开发,常用于服务发现、共享配置以及并发控制(如 leader 选举、分布式锁等)。kubernetes 使用 etcd 存储所有运行数据。 本文档介绍部署一个三节点高可用 etcd 集群的步骤: 下载和分发 etcd 二进制文件; 创建 etcd 集群各节点的 x509 证书,用于加密客户端(如 etcdctl) 与 etcd 集群、etcd 集群之间的数据流; 创建 etcd 的 systemd unit 文件,配置服务参数; 检查集群工作状态; etcd 集群各节点的名称和 IP 如下: test1:192.168.0.91 test2:192.168.0.92 test3:192.168.0.93 下载和分发 etcd 二进制文件 tar -xvf etcd-v3.3.7-linux-amd64.tar.gz 分发二进制文件到集群所有节点: source /opt/k8s/bin/environment.sh for node_ip in ${NODE_IPS[@]} do echo ">>> ${node_ip}" scp etcd-v3.3.7-linux-amd64/etcd* [email protected]${node_ip}:/opt/k8s/bin ssh [email protected]${node_ip} "chmod +x /opt/k8s/bin/*" done 创建 etcd 证书和私钥 创建证书签名请求: cat > etcd-csr.json <<EOF { "CN": "etcd", "hosts": [ "127.0.0.1", "192.168.0.91", "192.168.0.92", "192.168.0.93" ], "key": { "algo": "rsa", "size": 2048 }, "names": [ { "C": "CN", "ST": "BeiJing", "L": "BeiJing", "O": "k8s", "OU": "4Paradigm" } ] } EOF hosts 字段指定授权使用该证书的 etcd 节点 IP 或域名列表,这里将 etcd 集群的三个节点 IP 都列在其中; 生成证书和私钥: cfssl gencert -ca=/etc/kubernetes/cert/ca.pem 
-ca-key=/etc/kubernetes/cert/ca-key.pem -config=/etc/kubernetes/cert/ca-config.json -profile=kubernetes etcd-csr.json | cfssljson -bare etcd ls etcd* 分发生成的证书和私钥到各 etcd 节点: source /opt/k8s/bin/environment.sh for node_ip in ${NODE_IPS[@]} do echo ">>> ${node_ip}" ssh [email protected]${node_ip} "mkdir -p /etc/etcd/cert && chown -R k8s /etc/etcd/cert" scp etcd*.pem [email protected]${node_ip}:/etc/etcd/cert/ done 创建 etcd 的 systemd unit 模板文件 source /opt/k8s/bin/environment.sh cat > etcd.service.template <<EOF [Unit] Description=Etcd Server After=network.target After=network-online.target Wants=network-online.target Documentation=https://github.com/coreos [Service] User=k8s Type=notify WorkingDirectory=/var/lib/etcd/ ExecStart=/opt/k8s/bin/etcd \ --data-dir=/var/lib/etcd \ --name=##NODE_NAME## \\ --cert-file=/etc/etcd/cert/etcd.pem \ --key-file=/etc/etcd/cert/etcd-key.pem \ --trusted-ca-file=/etc/kubernetes/cert/ca.pem \ --peer-cert-file=/etc/etcd/cert/etcd.pem \ --peer-key-file=/etc/etcd/cert/etcd-key.pem \ --peer-trusted-ca-file=/etc/kubernetes/cert/ca.pem \ --peer-client-cert-auth \ --client-cert-auth \ --listen-peer-urls=https://##NODE_IP##:2380 \\ --initial-advertise-peer-urls=https://##NODE_IP##:2380 \\ --listen-client-urls=https://##NODE_IP##:2379,http://127.0.0.1:2379 \\ --advertise-client-urls=https://##NODE_IP##:2379 \\ --initial-cluster-token=etcd-cluster-0 \ --initial-cluster=${ETCD_NODES} \ --initial-cluster-state=new Restart=on-failure RestartSec=5 LimitNOFILE=65536 [Install] WantedBy=multi-user.target EOF User:指定以 k8s 账户运行; WorkingDirectory、--data-dir:指定工作目录和数据目录为 /var/lib/etcd,需在启动服务前创建这个目录; --name:指定节点名称,当 --initial-cluster-state 值为 new 时,--name 的参数值必须位于 --initial-cluster 列表中; --cert-file、--key-file:etcd server 与 client 通信时使用的证书和私钥; --trusted-ca-file:签名 client 证书的 CA 证书,用于验证 client 证书 --peer-cert-file、--peer-key-file:etcd 与 peer 通信使用的证书和私钥; --peer-trusted-ca-file:签名 peer 证书的 CA 证书,用于验证 peer 证书; 为各节点分发 etcd systemd unit 文件 分发时替换模板文件中的变量 source /opt/k8s/bin/environment.sh for (( i=0; i < 3; i++ )) do sed -e "s/##NODE_NAME##/${NODE_NAMES[i]}/" -e "s/##NODE_IP##/${NODE_IPS[i]}/" etcd.service.template > etcd-${NODE_IPS[i]}.service done ls *.service NODE_NAMES 和 NODE_IPS 为相同长度的 bash 数组,分别为节点名称和对应的 IP; 分发生成的 systemd unit 文件: source /opt/k8s/bin/environment.sh for node_ip in ${NODE_IPS[@]} do echo ">>> ${node_ip}" ssh [email protected]${node_ip} "mkdir -p /var/lib/etcd && chown -R k8s /var/lib/etcd" scp etcd-${node_ip}.service [email protected]${node_ip}:/etc/systemd/system/etcd.service done 必须先创建 etcd 数据目录和工作目录; 启动 etcd 服务 source /opt/k8s/bin/environment.sh for node_ip in ${NODE_IPS[@]} do echo ">>> ${node_ip}" ssh [email protected]${node_ip} "systemctl daemon-reload && systemctl enable etcd && systemctl restart etcd &" done etcd 进程首次启动时会等待其它节点的 etcd 加入集群,命令 systemctl start etcd 会卡住一段时间,为正常现象。 检查启动结果 source /opt/k8s/bin/environment.sh for node_ip in ${NODE_IPS[@]} do echo ">>> ${node_ip}" ssh [email protected]${node_ip} "systemctl status etcd|grep Active" done 验证服务状态 部署完 etcd 集群后,在任一 etc 节点上执行如下命令: source /opt/k8s/bin/environment.sh for node_ip in ${NODE_IPS[@]} do echo ">>> ${node_ip}" ETCDCTL_API=3 /opt/k8s/bin/etcdctl --endpoints=https://${node_ip}:2379 --cacert=/etc/kubernetes/cert/ca.pem --cert=/etc/etcd/cert/etcd.pem --key=/etc/etcd/cert/etcd-key.pem endpoint health done 预期输出: https://192.168.0.91:2379 is healthy: successfully committed proposal: took = 2.192932ms https://192.168.0.92:2379 is healthy: successfully committed proposal: took = 3.546896ms https://192.168.0.93:2379 
is healthy: successfully committed proposal: took = 3.013667ms 输出均为 healthy 时表示集群服务正常。 05.部署 flannel 网络 kubernetes 要求集群内各节点(包括 master 节点)能通过 Pod 网段互联互通。flannel 使用 vxlan 技术为各节点创建一个可以互通的 Pod 网络。 flaneel 第一次启动时,从 etcd 获取 Pod 网段信息,为本节点分配一个未使用的 /24 段地址,然后创建 flannedl.1(也可能是其它名称,如 flannel1 等) 接口。 flannel 将分配的 Pod 网段信息写入 /run/flannel/docker 文件,docker 后续使用这个文件中的环境变量设置 docker0 网桥。 下载和分发 flanneld 二进制文件 到 https://github.com/coreos/flannel/releases 页面下载最新版本的发布包: mkdir flannel wget https://github.com/coreos/flannel/releases/download/v0.10.0/flannel-v0.10.0-linux-amd64.tar.gz tar -xzvf flannel-v0.10.0-linux-amd64.tar.gz -C flannel 分发 flanneld 二进制文件到集群所有节点: source /opt/k8s/bin/environment.sh for node_ip in ${NODE_IPS[@]} do echo ">>> ${node_ip}" scp flannel/{flanneld,mk-docker-opts.sh} [email protected]${node_ip}:/opt/k8s/bin/ ssh [email protected]${node_ip} "chmod +x /opt/k8s/bin/*" done 创建 flannel 证书和私钥 flannel 从 etcd 集群存取网段分配信息,而 etcd 集群启用了双向 x509 证书认证,所以需要为 flanneld 生成证书和私钥。 创建证书签名请求: cat > flanneld-csr.json <<EOF { "CN": "flanneld", "hosts": [], "key": { "algo": "rsa", "size": 2048 }, "names": [ { "C": "CN", "ST": "BeiJing", "L": "BeiJing", "O": "k8s", "OU": "4Paradigm" } ] } EOF 该证书只会被 kubectl 当做 client 证书使用,所以 hosts 字段为空; 生成证书和私钥: cfssl gencert -ca=/etc/kubernetes/cert/ca.pem -ca-key=/etc/kubernetes/cert/ca-key.pem -config=/etc/kubernetes/cert/ca-config.json -profile=kubernetes flanneld-csr.json | cfssljson -bare flanneld ls flanneld*pem 将生成的证书和私钥分发到所有节点(master 和 worker): source /opt/k8s/bin/environment.sh for node_ip in ${NODE_IPS[@]} do echo ">>> ${node_ip}" ssh [email protected]${node_ip} "mkdir -p /etc/flanneld/cert && chown -R k8s /etc/flanneld" scp flanneld*.pem [email protected]${node_ip}:/etc/flanneld/cert done 向 etcd 写入集群 Pod 网段信息 注意:本步骤只需执行一次。 source /opt/k8s/bin/environment.sh etcdctl --endpoints=${ETCD_ENDPOINTS} --ca-file=/etc/kubernetes/cert/ca.pem --cert-file=/etc/flanneld/cert/flanneld.pem --key-file=/etc/flanneld/cert/flanneld-key.pem set ${FLANNEL_ETCD_PREFIX}/config ‘{"Network":"‘${CLUSTER_CIDR}‘", "SubnetLen": 24, "Backend": {"Type": "vxlan"}}‘ flanneld 当前版本 (v0.10.0) 不支持 etcd v3,故使用 etcd v2 API 写入配置 key 和网段数据; 写入的 Pod 网段 ${CLUSTER_CIDR} 必须是 /16 段地址,必须与 kube-controller-manager 的 --cluster-cidr 参数值一致; 创建 flanneld 的 systemd unit 文件 source /opt/k8s/bin/environment.sh export IFACE=eth0 cat > flanneld.service << EOF [Unit] Description=Flanneld overlay address etcd agent After=network.target After=network-online.target Wants=network-online.target After=etcd.service Before=docker.service [Service] Type=notify ExecStart=/opt/k8s/bin/flanneld \ -etcd-cafile=/etc/kubernetes/cert/ca.pem \ -etcd-certfile=/etc/flanneld/cert/flanneld.pem \ -etcd-keyfile=/etc/flanneld/cert/flanneld-key.pem \ -etcd-endpoints=${ETCD_ENDPOINTS} \ -etcd-prefix=${FLANNEL_ETCD_PREFIX} \ -iface=${IFACE} ExecStartPost=/opt/k8s/bin/mk-docker-opts.sh -k DOCKER_NETWORK_OPTIONS -d /run/flannel/docker Restart=on-failure [Install] WantedBy=multi-user.target RequiredBy=docker.service EOF mk-docker-opts.sh 脚本将分配给 flanneld 的 Pod 子网网段信息写入 /run/flannel/docker 文件,后续 docker 启动时使用这个文件中的环境变量配置 docker0 网桥; flanneld 使用系统缺省路由所在的接口与其它节点通信,对于有多个网络接口(如内网和公网)的节点,可以用 -iface 参数指定通信接口,如上面的 eth0 接口; flanneld 运行时需要 root 权限; 分发 flanneld systemd unit 文件到所有节点 source /opt/k8s/bin/environment.sh for node_ip in ${NODE_IPS[@]} do echo ">>> ${node_ip}" scp flanneld.service [email protected]${node_ip}:/etc/systemd/system/ done 启动 flanneld 服务 source /opt/k8s/bin/environment.sh for node_ip in ${NODE_IPS[@]} do echo ">>> ${node_ip}" ssh [email 
protected]${node_ip} "systemctl daemon-reload && systemctl enable flanneld && systemctl restart flanneld" done 检查启动结果 source /opt/k8s/bin/environment.sh for node_ip in ${NODE_IPS[@]} do echo ">>> ${node_ip}" ssh [email protected]${node_ip} "systemctl status flanneld|grep Active" done 确保状态为 active (running),否则查看日志,确认原因: $ journalctl -u flanneld 检查分配给各 flanneld 的 Pod 网段信息 查看集群 Pod 网段(/16): source /opt/k8s/bin/environment.sh etcdctl --endpoints=${ETCD_ENDPOINTS} --ca-file=/etc/kubernetes/cert/ca.pem --cert-file=/etc/flanneld/cert/flanneld.pem --key-file=/etc/flanneld/cert/flanneld-key.pem get ${FLANNEL_ETCD_PREFIX}/config 输出: {"Network":"172.30.0.0/16", "SubnetLen": 24, "Backend": {"Type": "vxlan"}} 查看已分配的 Pod 子网段列表(/24): source /opt/k8s/bin/environment.sh etcdctl --endpoints=${ETCD_ENDPOINTS} --ca-file=/etc/kubernetes/cert/ca.pem --cert-file=/etc/flanneld/cert/flanneld.pem --key-file=/etc/flanneld/cert/flanneld-key.pem ls ${FLANNEL_ETCD_PREFIX}/subnets 输出: /kubernetes/network/subnets/172.30.81.0-24 /kubernetes/network/subnets/172.30.29.0-24 /kubernetes/network/subnets/172.30.39.0-24 查看某一 Pod 网段对应的节点 IP 和 flannel 接口地址: source /opt/k8s/bin/environment.sh etcdctl --endpoints=${ETCD_ENDPOINTS} --ca-file=/etc/kubernetes/cert/ca.pem --cert-file=/etc/flanneld/cert/flanneld.pem --key-file=/etc/flanneld/cert/flanneld-key.pem get ${FLANNEL_ETCD_PREFIX}/subnets/172.30.81.0-24 输出: {"PublicIP":"192.168.0.91","BackendType":"vxlan","BackendData":{"VtepMAC":"12:21:93:9e:b1:eb"}} 验证各节点能通过 Pod 网段互通 在各节点上部署 flannel 后,检查是否创建了 flannel 接口(名称可能为 flannel0、flannel.0、flannel.1 等): source /opt/k8s/bin/environment.sh for node_ip in ${NODE_IPS[@]} do echo ">>> ${node_ip}" ssh ${node_ip} "/usr/sbin/ip addr show flannel.1|grep -w inet" done 输出: inet 172.30.81.0/32 scope global flannel.1 inet 172.30.29.0/32 scope global flannel.1 inet 172.30.39.0/32 scope global flannel.1 在各节点上 ping 所有 flannel 接口 IP,确保能通: source /opt/k8s/bin/environment.sh for node_ip in ${NODE_IPS[@]} do echo ">>> ${node_ip}" ssh ${node_ip} "ping -c 1 172.30.81.0" ssh ${node_ip} "ping -c 1 172.30.29.0" ssh ${node_ip} "ping -c 1 172.30.39.0" done 06-0.部署 master 节点 kubernetes master 节点运行如下组件: kube-apiserver kube-scheduler kube-controller-manager kube-scheduler 和 kube-controller-manager 可以以集群模式运行,通过 leader 选举产生一个工作进程,其它进程处于阻塞模式。 对于 kube-apiserver,可以运行多个实例(本文档是 3 实例),但对其它组件需要提供统一的访问地址,该地址需要高可用。本文档使用 keepalived 和 haproxy 实现 kube-apiserver VIP 高可用和负载均衡。 下载最新版本的二进制文件 从 CHANGELOG页面 下载 server tarball 文件。 wget https://dl.k8s.io/v1.10.4/kubernetes-server-linux-amd64.tar.gz tar -xzvf kubernetes-server-linux-amd64.tar.gz cd kubernetes tar -xzvf kubernetes-src.tar.gz 将二进制文件拷贝到所有 master 节点: source /opt/k8s/bin/environment.sh for node_ip in ${NODE_IPS[@]} do echo ">>> ${node_ip}" scp server/bin/* [email protected]${node_ip}:/opt/k8s/bin/ ssh [email protected]${node_ip} "chmod +x /opt/k8s/bin/*" done 06-1.部署高可用组件 本文档讲解使用 keepalived 和 haproxy 实现 kube-apiserver 高可用的步骤: keepalived 提供 kube-apiserver 对外服务的 VIP; haproxy 监听 VIP,后端连接所有 kube-apiserver 实例,提供健康检查和负载均衡功能; 运行 keepalived 和 haproxy 的节点称为 LB 节点。由于 keepalived 是一主多备运行模式,故至少两个 LB 节点。 本文档复用 master 节点的三台机器,haproxy 监听的端口(8443) 需要与 kube-apiserver 的端口 6443 不同,避免冲突。 keepalived 在运行过程中周期检查本机的 haproxy 进程状态,如果检测到 haproxy 进程异常,则触发重新选主的过程,VIP 将飘移到新选出来的主节点,从而实现 VIP 的高可用。 所有组件(如 kubeclt、apiserver、controller-manager、scheduler 等)都通过 VIP 和 haproxy 监听的 8443 端口访问 kube-apiserver 服务。 安装软件包 source /opt/k8s/bin/environment.sh for node_ip in ${NODE_IPS[@]} do echo ">>> ${node_ip}" ssh [email protected]${node_ip} "yum install -y keepalived haproxy" 
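# 注:下文 keepalived 配置中的 check-haproxy 脚本用到 killall 命令(由 psmisc 包提供),
# 最小化安装的 CentOS 可能缺少该命令;以下为补充示例,可按需在此一并安装:
# ssh root@${node_ip} "yum install -y psmisc"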
done 配置和下发 haproxy 配置文件 haproxy 配置文件: cat > haproxy.cfg <<EOF global log /dev/log local0 log /dev/log local1 notice chroot /var/lib/haproxy stats socket /var/run/haproxy-admin.sock mode 660 level admin stats timeout 30s user haproxy group haproxy daemon nbproc 1 defaults log global timeout connect 5000 timeout client 10m timeout server 10m listen admin_stats bind 0.0.0.0:10080 mode http log 127.0.0.1 local0 err stats refresh 30s stats uri /status stats realm welcome login\ Haproxy stats auth admin:123456 stats hide-version stats admin if TRUE listen kube-master bind 0.0.0.0:8443 mode tcp option tcplog balance source server 192.168.0.91 192.168.0.91:6443 check inter 2000 fall 2 rise 2 weight 1 server 192.168.0.92 192.168.0.92:6443 check inter 2000 fall 2 rise 2 weight 1 server 192.168.0.93 192.168.0.93:6443 check inter 2000 fall 2 rise 2 weight 1 EOF haproxy 在 10080 端口输出 status 信息; haproxy 监听所有接口的 8443 端口,该端口与环境变量 ${KUBE_APISERVER} 指定的端口必须一致; server 字段列出所有 kube-apiserver 监听的 IP 和端口; 下发 haproxy.cfg 到所有 master 节点: source /opt/k8s/bin/environment.sh for node_ip in ${NODE_IPS[@]} do echo ">>> ${node_ip}" scp haproxy.cfg [email protected]${node_ip}:/etc/haproxy done 起 haproxy 服务 source /opt/k8s/bin/environment.sh for node_ip in ${NODE_IPS[@]} do echo ">>> ${node_ip}" ssh [email protected]${node_ip} "systemctl restart haproxy" done 检查 haproxy 服务状态 source /opt/k8s/bin/environment.sh for node_ip in ${NODE_IPS[@]} do echo ">>> ${node_ip}" ssh [email protected]${node_ip} "systemctl status haproxy|grep Active" done 确保状态为 active (running),否则查看日志,确认原因: journalctl -u haproxy 检查 haproxy 是否监听 8443 端口: source /opt/k8s/bin/environment.sh for node_ip in ${NODE_IPS[@]} do echo ">>> ${node_ip}" ssh [email protected]${node_ip} "netstat -lnpt|grep haproxy" done 确保输出类似于: tcp 0 0 0.0.0.0:8443 0.0.0.0:* LISTEN 120583/haproxy 配置和下发 keepalived 配置文件 keepalived 是一主(master)多备(backup)运行模式,故有两种类型的配置文件。master 配置文件只有一份,backup 配置文件视节点数目而定,对于本文档而言,规划如下: master: 192.168.0.91 backup:192.168.0.92、192.168.0.93 master 配置文件: source /opt/k8s/bin/environment.sh cat > keepalived-master.conf <<EOF global_defs { router_id lb-master-105 } vrrp_script check-haproxy { script "killall -0 haproxy" interval 5 weight -30 } vrrp_instance VI-kube-master { state MASTER priority 120 dont_track_primary interface ${VIP_IF} virtual_router_id 68 advert_int 3 track_script { check-haproxy } virtual_ipaddress { ${MASTER_VIP} } } EOF VIP 所在的接口(interface ${VIP_IF})为 eth0; 使用 killall -0 haproxy 命令检查所在节点的 haproxy 进程是否正常。如果异常则将权重减少(-30),从而触发重新选主过程; router_id、virtual_router_id 用于标识属于该 HA 的 keepalived 实例,如果有多套 keepalived HA,则必须各不相同; backup 配置文件: source /opt/k8s/bin/environment.sh cat > keepalived-backup.conf <<EOF global_defs { router_id lb-backup-105 } vrrp_script check-haproxy { script "killall -0 haproxy" interval 5 weight -30 } vrrp_instance VI-kube-master { state BACKUP priority 110 dont_track_primary interface ${VIP_IF} virtual_router_id 68 advert_int 3 track_script { check-haproxy } virtual_ipaddress { ${MASTER_VIP} } } EOF VIP 所在的接口(interface ${VIP_IF})为 eth0; 使用 killall -0 haproxy 命令检查所在节点的 haproxy 进程是否正常。如果异常则将权重减少(-30),从而触发重新选主过程; router_id、virtual_router_id 用于标识属于该 HA 的 keepalived 实例,如果有多套 keepalived HA,则必须各不相同; priority 的值必须小于 master 的值; 下发 keepalived 配置文件 下发 master 配置文件: scp keepalived-master.conf [email protected]:/etc/keepalived/keepalived.conf 下发 backup 配置文件: scp keepalived-backup.conf [email protected]:/etc/keepalived/keepalived.conf scp keepalived-backup.conf [email protected]:/etc/keepalived/keepalived.conf 起 keepalived 服务 source 
/opt/k8s/bin/environment.sh
for node_ip in ${NODE_IPS[@]}
do
echo ">>> ${node_ip}"
ssh root@${node_ip} "systemctl restart keepalived"
done
检查 keepalived 服务
source /opt/k8s/bin/environment.sh
for node_ip in ${NODE_IPS[@]}
do
echo ">>> ${node_ip}"
ssh root@${node_ip} "systemctl status keepalived|grep Active"
done
确保状态为 active (running),否则查看日志,确认原因:
journalctl -u keepalived
查看 VIP 所在的节点,确保可以 ping 通 VIP:
source /opt/k8s/bin/environment.sh
for node_ip in ${NODE_IPS[@]}
do
echo ">>> ${node_ip}"
ssh ${node_ip} "/usr/sbin/ip addr show ${VIP_IF}"
ssh ${node_ip} "ping -c 1 ${MASTER_VIP}"
done
查看 haproxy 状态页面
浏览器访问 ${MASTER_VIP}:10080/status 地址,查看 haproxy 状态页面:
06-1.部署 kube-apiserver 组件
本文档讲解使用 keepalived 和 haproxy 部署一个 3 节点高可用 master 集群的步骤,对应的 LB VIP 为环境变量 ${MASTER_VIP}。
准备工作
下载最新版本的二进制文件、安装和配置 flanneld 参考:06-0.部署master节点.md
创建 kubernetes 证书和私钥
创建证书签名请求:
source /opt/k8s/bin/environment.sh
cat > kubernetes-csr.json <<EOF
{
  "CN": "kubernetes",
  "hosts": [
    "127.0.0.1",
    "192.168.0.91",
    "192.168.0.92",
    "192.168.0.93",
    "${MASTER_VIP}",
    "${CLUSTER_KUBERNETES_SVC_IP}",
    "kubernetes",
    "kubernetes.default",
    "kubernetes.default.svc",
    "kubernetes.default.svc.cluster",
    "kubernetes.default.svc.cluster.local"
  ],
  "key": {
    "algo": "rsa",
    "size": 2048
  },
  "names": [
    {
      "C": "CN",
      "ST": "BeiJing",
      "L": "BeiJing",
      "O": "k8s",
      "OU": "4Paradigm"
    }
  ]
}
EOF
hosts 字段指定授权使用该证书的 IP 或域名列表,这里列出了 VIP、apiserver 节点 IP、kubernetes 服务 IP 和域名;
域名最后字符不能是 .(如不能为 kubernetes.default.svc.cluster.local.),否则解析时失败,提示: x509: cannot parse dnsName "kubernetes.default.svc.cluster.local.";
如果使用非 cluster.local 域名,如 opsnull.com,则需要修改域名列表中的最后两个域名为:kubernetes.default.svc.opsnull、kubernetes.default.svc.opsnull.com
kubernetes 服务 IP 是 apiserver 自动创建的,一般是 --service-cluster-ip-range 参数指定的网段的第一个IP,后续可以通过如下命令获取:
$ kubectl get svc kubernetes
NAME         CLUSTER-IP   EXTERNAL-IP   PORT(S)   AGE
kubernetes   10.254.0.1   <none>        443/TCP   1d
生成证书和私钥:
cfssl gencert -ca=/etc/kubernetes/cert/ca.pem -ca-key=/etc/kubernetes/cert/ca-key.pem -config=/etc/kubernetes/cert/ca-config.json -profile=kubernetes kubernetes-csr.json | cfssljson -bare kubernetes
ls kubernetes*pem
将生成的证书和私钥文件拷贝到 master 节点:
source /opt/k8s/bin/environment.sh
for node_ip in ${NODE_IPS[@]}
do
echo ">>> ${node_ip}"
ssh root@${node_ip} "mkdir -p /etc/kubernetes/cert/ && chown -R k8s /etc/kubernetes/cert/"
scp kubernetes*.pem root@${node_ip}:/etc/kubernetes/cert/
done
k8s 账户可以读写 /etc/kubernetes/cert/ 目录;
创建加密配置文件
source /opt/k8s/bin/environment.sh
cat > encryption-config.yaml <<EOF
kind: EncryptionConfig
apiVersion: v1
resources:
  - resources:
      - secrets
    providers:
      - aescbc:
          keys:
            - name: key1
              secret: ${ENCRYPTION_KEY}
      - identity: {}
EOF
将加密配置文件拷贝到 master 节点的 /etc/kubernetes 目录下:
source /opt/k8s/bin/environment.sh
for node_ip in ${NODE_IPS[@]}
do
echo ">>> ${node_ip}"
scp encryption-config.yaml root@${node_ip}:/etc/kubernetes/
done
替换后的 encryption-config.yaml 文件:encryption-config.yaml
生成 service account key
cd /etc/kubernetes/
openssl genrsa -out /etc/kubernetes/sa.key 2048
openssl rsa -in /etc/kubernetes/sa.key -pubout -out /etc/kubernetes/sa.pub
ls /etc/kubernetes/sa.*
cd $HOME
分发 service account key 到所有节点(这里通过 ansible 的 copy 模块分发到各节点的 /etc/kubernetes/cert/ 目录):
ansible k8s -m copy -a 'src=/etc/kubernetes/sa.key dest=/etc/kubernetes/cert/ force=yes'
ansible k8s -m copy -a 'src=/etc/kubernetes/sa.pub dest=/etc/kubernetes/cert/ force=yes'
创建 kube-apiserver systemd unit 模板文件
source /opt/k8s/bin/environment.sh
cat > kube-apiserver.service.template <<EOF
[Unit]
Description=Kubernetes API Server
Documentation=https://github.com/GoogleCloudPlatform/kubernetes
After=network.target

[Service]
ExecStart=/opt/k8s/bin/kube-apiserver \\
  --enable-admission-plugins=NamespaceLifecycle,LimitRanger,ServiceAccount,DefaultStorageClass,DefaultTolerationSeconds,MutatingAdmissionWebhook,ValidatingAdmissionWebhook,ResourceQuota \\
  --anonymous-auth=false \\
  --experimental-encryption-provider-config=/etc/kubernetes/encryption-config.yaml \\
  --advertise-address=##NODE_IP## \\
  --bind-address=##NODE_IP## \\
  --insecure-port=0 \\
  --authorization-mode=Node,RBAC \\
  --kubelet-preferred-address-types=InternalIP,ExternalIP,Hostname \\
  --runtime-config=api/all \\
  --enable-bootstrap-token-auth \\
  --service-cluster-ip-range=${SERVICE_CIDR} \\
  --service-node-port-range=${NODE_PORT_RANGE} \\
  --tls-cert-file=/etc/kubernetes/cert/kubernetes.pem \\
  --tls-private-key-file=/etc/kubernetes/cert/kubernetes-key.pem \\
  --client-ca-file=/etc/kubernetes/cert/ca.pem \\
  --kubelet-client-certificate=/etc/kubernetes/cert/kubernetes.pem \\
  --kubelet-client-key=/etc/kubernetes/cert/kubernetes-key.pem \\
  --service-account-key-file=/etc/kubernetes/cert/sa.pub \\
  --etcd-cafile=/etc/kubernetes/cert/ca.pem \\
  --etcd-certfile=/etc/kubernetes/cert/kubernetes.pem \\
  --etcd-keyfile=/etc/kubernetes/cert/kubernetes-key.pem \\
  --etcd-servers=${ETCD_ENDPOINTS} \\
  --enable-swagger-ui=true \\
  --allow-privileged=true \\
  --apiserver-count=3 \\
  --audit-log-maxage=30 \\
  --audit-log-maxbackup=3 \\
  --audit-log-maxsize=100 \\
  --audit-log-path=/var/log/kube-apiserver-audit.log \\
  --event-ttl=1h \\
  --alsologtostderr=true \\
  --logtostderr=false \\
  --log-dir=/var/log/kubernetes \\
  --v=2
Restart=on-failure
RestartSec=5
Type=notify
User=k8s
LimitNOFILE=65536

[Install]
WantedBy=multi-user.target
EOF
--experimental-encryption-provider-config:启用加密特性;
--authorization-mode=Node,RBAC: 开启 Node 和 RBAC 授权模式,拒绝未授权的请求;
--enable-admission-plugins:启用 ServiceAccount 和 NodeRestriction;
--service-account-key-file:签名 ServiceAccount Token 的公钥文件,kube-controller-manager 的 --service-account-private-key-file 指定私钥文件,两者配对使用;
--tls-*-file:指定 apiserver 使用的证书、私钥和 CA 文件。--client-ca-file 用于验证 client (kube-controller-manager、kube-scheduler、kubelet、kube-proxy 等)请求所带的证书;
--kubelet-client-certificate、--kubelet-client-key:如果指定,则使用 https 访问 kubelet APIs;需要为证书对应的用户(上面 kubernetes*.pem 证书的用户为 kubernetes)定义 RBAC 规则,否则访问 kubelet API 时提示未授权;
--bind-address: 不能为 127.0.0.1,否则外界不能访问它的安全端口 6443;
--insecure-port=0:关闭监听非安全端口(8080);
--service-cluster-ip-range: 指定 Service Cluster IP 地址段;
--service-node-port-range: 指定 NodePort 的端口范围;
--runtime-config=api/all=true: 启用所有版本的 APIs,如 autoscaling/v2alpha1;
--enable-bootstrap-token-auth:启用 kubelet bootstrap 的 token 认证;
--apiserver-count=3:指定集群运行模式,多台 kube-apiserver 会通过 leader 选举产生一个工作节点,其它节点处于阻塞状态;
User=k8s:使用 k8s 账户运行;
为各节点创建和分发 kube-apiserver systemd unit 文件
替换模板文件中的变量,为各节点创建 systemd unit 文件:
source /opt/k8s/bin/environment.sh
for (( i=0; i < 3; i++ ))
do
sed -e "s/##NODE_NAME##/${NODE_NAMES[i]}/" -e "s/##NODE_IP##/${NODE_IPS[i]}/" kube-apiserver.service.template > kube-apiserver-${NODE_IPS[i]}.service
done
ls kube-apiserver*.service
NODE_NAMES 和 NODE_IPS 为相同长度的 bash 数组,分别为节点名称和对应的 IP;
分发生成的 systemd unit 文件:
source /opt/k8s/bin/environment.sh
for node_ip in ${NODE_IPS[@]}
do
echo ">>> ${node_ip}"
ssh root@${node_ip} "mkdir -p /var/log/kubernetes && chown -R k8s /var/log/kubernetes"
scp kube-apiserver-${node_ip}.service root@${node_ip}:/etc/systemd/system/kube-apiserver.service
done
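分发前后可以用下面的示例命令确认模板替换的结果(假设仍在生成 unit 文件的目录下执行),检查各节点 unit 文件中的监听地址是否已替换为对应节点 IP:
source /opt/k8s/bin/environment.sh
for node_ip in ${NODE_IPS[@]}
do
echo ">>> ${node_ip}"
grep -E "advertise-address|bind-address" kube-apiserver-${node_ip}.service
done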
必须先创建日志目录; 文件重命名为 kube-apiserver.service; 替换后的 unit 文件:kube-apiserver.service 启动 kube-apiserver 服务 source /opt/k8s/bin/environment.sh for node_ip in ${NODE_IPS[@]} do echo ">>> ${node_ip}" ssh [email protected]${node_ip} "systemctl daemon-reload && systemctl enable kube-apiserver && systemctl restart kube-apiserver" done 检查 kube-apiserver 运行状态 source /opt/k8s/bin/environment.sh for node_ip in ${NODE_IPS[@]} do echo ">>> ${node_ip}" ssh [email protected]${node_ip} "systemctl status kube-apiserver |grep ‘Active:‘" done 报错:如果找不到kube-controller-manager.kubeconfig文件会报如下错误: [[email protected] ~]# journalctl -u kube-controller-manager -- Logs begin at Mon 2019-02-04 17:56:47 EST, end at Tue 2019-02-05 01:04:33 EST. -- Feb 04 23:58:13 test1 systemd[1]: [/etc/systemd/system/kube-controller-manager.service:7] Failed to parse service restart specifier, Feb 04 23:58:13 test1 systemd[1]: [/etc/systemd/system/kube-controller-manager.service:7] Failed to parse service restart specifier, Feb 04 23:58:14 test1 kube-controller-manager[45817]: Flag --port has been deprecated, see --secure-port instead. Feb 04 23:58:14 test1 kube-controller-manager[45817]: Flag --horizontal-pod-autoscaler-use-rest-clients has been deprecated, Heapster Feb 04 23:58:14 test1 kube-controller-manager[45817]: I0204 23:58:14.297286 45817 flags.go:33] FLAG: --address="0.0.0.0" 确保状态为 active (running),否则到 master 节点查看日志,确认原因: journalctl -u kube-apiserver 打印 kube-apiserver 写入 etcd 的数据 source /opt/k8s/bin/environment.sh ETCDCTL_API=3 etcdctl --endpoints=${ETCD_ENDPOINTS} --cacert=/etc/kubernetes/cert/ca.pem --cert=/etc/etcd/cert/etcd.pem --key=/etc/etcd/cert/etcd-key.pem get /registry/ --prefix --keys-only 检查集群信息 kubectl cluster-info Kubernetes master is running at https://192.168.0.235:8443 To further debug and diagnose cluster problems, use ‘kubectl cluster-info dump‘. kubectl get all --all-namespaces NAMESPACE NAME TYPE CLUSTER-IP EXTERNAL-IP PORT(S) AGE default service/kubernetes ClusterIP 10.254.0.1 <none> 443/TCP 35m kubectl get componentstatuses NAME STATUS MESSAGE ERROR controller-manager Unhealthy Get http://127.0.0.1:10252/healthz: dial tcp 127.0.0.1:10252: getsockopt: connection refused scheduler Unhealthy Get http://127.0.0.1:10251/healthz: dial tcp 127.0.0.1:10251: getsockopt: connection refused etcd-1 Healthy {"health":"true"} etcd-0 Healthy {"health":"true"} etcd-2 Healthy {"health":"true"} 注意: 如果执行 kubectl 命令式时输出如下错误信息,则说明使用的 ~/.kube/config 文件不对,请切换到正确的账户后再执行该命令: The connection to the server localhost:8080 was refused - did you specify the right host or port? 
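排查这类错误时,可以先确认当前生效的 kubeconfig 指向的 apiserver 地址是否为 ${KUBE_APISERVER}(示例命令;KUBECONFIG 环境变量未设置时,kubectl 默认读取 ~/.kube/config):
kubectl config view --minify -o jsonpath='{.clusters[0].cluster.server}'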
执行 kubectl get componentstatuses 命令时,apiserver 默认向 127.0.0.1 发送请求。当 controller-manager、scheduler 以集群模式运行时,有可能和 kube-apiserver 不在一台机器上,这时 controller-manager 或 scheduler 的状态为 Unhealthy,但实际上它们工作正常。
检查 kube-apiserver 监听的端口
sudo netstat -lnpt|grep kube
tcp        0      0 192.168.0.91:6443      0.0.0.0:*      LISTEN      13075/kube-apiserve
6443: 接收 https 请求的安全端口,对所有请求做认证和授权;
由于关闭了非安全端口,故没有监听 8080;
授予 kubernetes 证书访问 kubelet API 的权限
(按照这个做完后感觉这一步没什么用,还得创建高级权限,创建权限在启动 kubelet 前创建)
kubectl create clusterrolebinding kube-apiserver:kubelet-apis --clusterrole=system:kubelet-api-admin --user kubernetes
在执行 kubectl exec、run、logs 等命令时,apiserver 会转发到 kubelet。这里定义 RBAC 规则,授权 apiserver 调用 kubelet API。
06-2.部署高可用 kube-controller-manager 集群
本文档介绍部署高可用 kube-controller-manager 集群的步骤。
该集群包含 3 个节点,启动后将通过竞争选举机制产生一个 leader 节点,其它节点为阻塞状态。当 leader 节点不可用后,剩余节点将再次进行选举产生新的 leader 节点,从而保证服务的可用性。
为保证通信安全,本文档先生成 x509 证书和私钥,kube-controller-manager 在如下两种情况下使用该证书:
与 kube-apiserver 的安全端口通信时;
在安全端口(https,10252) 输出 prometheus 格式的 metrics;
准备工作
下载最新版本的二进制文件、安装和配置 flanneld 参考:06-0.部署master节点.md
创建 kube-controller-manager 证书和私钥
创建证书签名请求:
cat > kube-controller-manager-csr.json <<EOF
{
  "CN": "system:kube-controller-manager",
  "key": {
    "algo": "rsa",
    "size": 2048
  },
  "hosts": [
    "127.0.0.1",
    "192.168.0.91",
    "192.168.0.92",
    "192.168.0.93"
  ],
  "names": [
    {
      "C": "CN",
      "ST": "BeiJing",
      "L": "BeiJing",
      "O": "system:kube-controller-manager",
      "OU": "4Paradigm"
    }
  ]
}
EOF
hosts 列表包含所有 kube-controller-manager 节点 IP;
CN 为 system:kube-controller-manager、O 为 system:kube-controller-manager,kubernetes 内置的 ClusterRoleBindings system:kube-controller-manager 赋予 kube-controller-manager 工作所需的权限。
生成证书和私钥:
cfssl gencert -ca=/etc/kubernetes/cert/ca.pem -ca-key=/etc/kubernetes/cert/ca-key.pem -config=/etc/kubernetes/cert/ca-config.json -profile=kubernetes kube-controller-manager-csr.json | cfssljson -bare kube-controller-manager
将生成的证书和私钥分发到所有 master 节点:
source /opt/k8s/bin/environment.sh
for node_ip in ${NODE_IPS[@]}
do
echo ">>> ${node_ip}"
scp kube-controller-manager*.pem root@${node_ip}:/etc/kubernetes/cert/
done
创建和分发 kubeconfig 文件
kubeconfig 文件包含访问 apiserver 的所有信息,如 apiserver 地址、CA 证书和自身使用的证书;
source /opt/k8s/bin/environment.sh
kubectl config set-cluster kubernetes --certificate-authority=/etc/kubernetes/cert/ca.pem --embed-certs=true --server=${KUBE_APISERVER} --kubeconfig=kube-controller-manager.kubeconfig
kubectl config set-credentials system:kube-controller-manager --client-certificate=kube-controller-manager.pem --client-key=kube-controller-manager-key.pem --embed-certs=true --kubeconfig=kube-controller-manager.kubeconfig
kubectl config set-context system:kube-controller-manager --cluster=kubernetes --user=system:kube-controller-manager --kubeconfig=kube-controller-manager.kubeconfig
kubectl config use-context system:kube-controller-manager --kubeconfig=kube-controller-manager.kubeconfig
分发 kubeconfig 到所有 master 节点:
source /opt/k8s/bin/environment.sh
for node_ip in ${NODE_IPS[@]}
do
echo ">>> ${node_ip}"
scp kube-controller-manager.kubeconfig root@${node_ip}:/etc/kubernetes/
done
创建和分发 kube-controller-manager systemd unit 文件
source /opt/k8s/bin/environment.sh
cat > kube-controller-manager.service <<EOF
[Unit]
Description=Kubernetes Controller Manager
Documentation=https://github.com/GoogleCloudPlatform/kubernetes

[Service]
ExecStart=/opt/k8s/bin/kube-controller-manager \\
  --port=0 \\
  --secure-port=10252 \\
  --bind-address=127.0.0.1 \\
  --kubeconfig=/etc/kubernetes/kube-controller-manager.kubeconfig \\
  --service-cluster-ip-range=${SERVICE_CIDR} \\
  --allocate-node-cidrs=true \\
  --cluster-cidr=${CLUSTER_CIDR} \\
  --cluster-name=kubernetes \\
  --cluster-signing-cert-file=/etc/kubernetes/cert/ca.pem \\
  --cluster-signing-key-file=/etc/kubernetes/cert/ca-key.pem \\
  --experimental-cluster-signing-duration=8760h \\
  --root-ca-file=/etc/kubernetes/cert/ca.pem \\
  --service-account-private-key-file=/etc/kubernetes/cert/sa.key \\
  --leader-elect=true \\
  --feature-gates=RotateKubeletServerCertificate=true \\
  --controllers=*,bootstrapsigner,tokencleaner \\
  --horizontal-pod-autoscaler-use-rest-clients=true \\
  --horizontal-pod-autoscaler-sync-period=10s \\
  --tls-cert-file=/etc/kubernetes/cert/kube-controller-manager.pem \\
  --tls-private-key-file=/etc/kubernetes/cert/kube-controller-manager-key.pem \\
  --use-service-account-credentials=true \\
  --alsologtostderr=true \\
  --logtostderr=false \\
  --log-dir=/var/log/kubernetes \\
  --v=2
Restart=on-failure
RestartSec=5
User=k8s

[Install]
WantedBy=multi-user.target
EOF
--port=0:关闭监听 http /metrics 的请求,同时 --address 参数无效,--bind-address 参数有效;
--secure-port=10252、--bind-address=127.0.0.1: 在 127.0.0.1 的 10252 端口监听 https /metrics 请求;
--kubeconfig:指定 kubeconfig 文件路径,kube-controller-manager 使用它连接和验证 kube-apiserver;
--cluster-signing-*-file:签名 TLS Bootstrap 创建的证书;
--experimental-cluster-signing-duration:指定 TLS Bootstrap 证书的有效期;
--root-ca-file:放置到容器 ServiceAccount 中的 CA 证书,用来对 kube-apiserver 的证书进行校验;
--service-account-private-key-file:签名 ServiceAccount 中 Token 的私钥文件,必须和 kube-apiserver 的 --service-account-key-file 指定的公钥文件配对使用;
--service-cluster-ip-range :指定 Service Cluster IP 网段,必须和 kube-apiserver 中的同名参数一致;
--leader-elect=true:集群运行模式,启用选举功能;被选为 leader 的节点负责处理工作,其它节点为阻塞状态;
--feature-gates=RotateKubeletServerCertificate=true:开启 kubelet server 证书的自动更新特性;
--controllers=*,bootstrapsigner,tokencleaner:启用的控制器列表,tokencleaner 用于自动清理过期的 Bootstrap token;
--horizontal-pod-autoscaler-*:custom metrics 相关参数,支持 autoscaling/v2alpha1;
--tls-cert-file、--tls-private-key-file:使用 https 输出 metrics 时使用的 Server 证书和秘钥;
--use-service-account-credentials=true:各 controller 使用单独的 ServiceAccount 访问 apiserver(见下文"kube-controller-manager 的权限");
User=k8s:使用 k8s 账户运行;
kube-controller-manager 不对请求 https metrics 的 Client 证书进行校验,故不需要指定 --tls-ca-file 参数,而且该参数已被淘汰。
分发 systemd unit 文件到所有 master 节点:
source /opt/k8s/bin/environment.sh
for node_ip in ${NODE_IPS[@]}
do
echo ">>> ${node_ip}"
scp kube-controller-manager.service root@${node_ip}:/etc/systemd/system/
done
kube-controller-manager 的权限
ClusterRole system:kube-controller-manager 的权限很小,只能创建 secret、serviceaccount 等资源对象,各 controller 的权限分散到 ClusterRole system:controller:XXX 中。
需要在 kube-controller-manager 的启动参数中添加 --use-service-account-credentials=true 参数,这样 main controller 会为各 controller 创建对应的 ServiceAccount XXX-controller。
内置的 ClusterRoleBinding system:controller:XXX 将赋予各 XXX-controller ServiceAccount 对应的 ClusterRole system:controller:XXX 权限。
启动 kube-controller-manager 服务
source /opt/k8s/bin/environment.sh
for node_ip in ${NODE_IPS[@]}
do
echo ">>> ${node_ip}"
ssh root@${node_ip} "mkdir -p /var/log/kubernetes && chown -R k8s /var/log/kubernetes"
ssh root@${node_ip} "systemctl daemon-reload && systemctl enable kube-controller-manager && systemctl restart kube-controller-manager"
done
必须先创建日志目录;
检查服务运行状态
source /opt/k8s/bin/environment.sh
for node_ip in ${NODE_IPS[@]}
do
echo ">>> ${node_ip}"
ssh root@${node_ip} "systemctl status kube-controller-manager|grep Active"
done
确保状态为 active (running),否则查看日志,确认原因:
journalctl -u kube-controller-manager
查看输出的 metric
注意:以下命令在 kube-controller-manager 节点上执行。
kube-controller-manager 监听 10252 端口,接收 https 请求:
sudo netstat
-lnpt|grep kube-controll tcp 0 0 127.0.0.1:10252 0.0.0.0:* LISTEN 18377/kube-controll curl -s --cacert /etc/kubernetes/cert/ca.pem https://127.0.0.1:10252/metrics |head # HELP ClusterRoleAggregator_adds Total number of adds handled by workqueue: ClusterRoleAggregator # TYPE ClusterRoleAggregator_adds counter ClusterRoleAggregator_adds 3 # HELP ClusterRoleAggregator_depth Current depth of workqueue: ClusterRoleAggregator # TYPE ClusterRoleAggregator_depth gauge ClusterRoleAggregator_depth 0 # HELP ClusterRoleAggregator_queue_latency How long an item stays in workqueueClusterRoleAggregator before being requested. # TYPE ClusterRoleAggregator_queue_latency summary ClusterRoleAggregator_queue_latency{quantile="0.5"} 57018 ClusterRoleAggregator_queue_latency{quantile="0.9"} 57268 curl --cacert CA 证书用来验证 kube-controller-manager https server 证书; 测试 kube-controller-manager 集群的高可用 停掉一个或两个节点的 kube-controller-manager 服务,观察其它节点的日志,看是否获取了 leader 权限。 查看当前的 leader kubectl get endpoints kube-controller-manager --namespace=kube-system -o yaml apiVersion: v1 kind: Endpoints metadata: annotations: control-plane.alpha.kubernetes.io/leader: ‘{"holderIdentity":"test2_084534e2-6cc4-11e8-a418-5254001f5b65","leaseDurationSeconds":15,"acquireTime":"2018-06-10T15:40:33Z","renewTime":"2018-06-10T16:19:08Z","leaderTransitions":12}‘ creationTimestamp: 2018-06-10T13:59:42Z name: kube-controller-manager namespace: kube-system resourceVersion: "4540" selfLink: /api/v1/namespaces/kube-system/endpoints/kube-controller-manager uid: 862cc048-6cb6-11e8-96fa-525400ba84c6 可见,当前的 leader 为 test2 节点。 06-3.部署高可用 kube-scheduler 集群 本文档介绍部署高可用 kube-scheduler 集群的步骤。 该集群包含 3 个节点,启动后将通过竞争选举机制产生一个 leader 节点,其它节点为阻塞状态。当 leader 节点不可用后,剩余节点将再次进行选举产生新的 leader 节点,从而保证服务的可用性。 为保证通信安全,本文档先生成 x509 证书和私钥,kube-scheduler 在如下两种情况下使用该证书: 与 kube-apiserver 的安全端口通信; 在安全端口(https,10251) 输出 prometheus 格式的 metrics; 准备工作 下载最新版本的二进制文件、安装和配置 flanneld 参考:06-0.部署master节点.md 创建 kube-scheduler 证书和私钥 创建证书签名请求: cat > kube-scheduler-csr.json <<EOF { "CN": "system:kube-scheduler", "hosts": [ "127.0.0.1", "192.168.0.91", "192.168.0.92", "192.168.0.93" ], "key": { "algo": "rsa", "size": 2048 }, "names": [ { "C": "CN", "ST": "BeiJing", "L": "BeiJing", "O": "system:kube-scheduler", "OU": "4Paradigm" } ] } EOF hosts 列表包含所有 kube-scheduler 节点 IP; CN 为 system:kube-scheduler、O 为 system:kube-scheduler,kubernetes 内置的 ClusterRoleBindings system:kube-scheduler 将赋予 kube-scheduler 工作所需的权限。 生成证书和私钥: cfssl gencert -ca=/etc/kubernetes/cert/ca.pem -ca-key=/etc/kubernetes/cert/ca-key.pem -config=/etc/kubernetes/cert/ca-config.json -profile=kubernetes kube-scheduler-csr.json | cfssljson -bare kube-scheduler 创建和分发 kubeconfig 文件 kubeconfig 文件包含访问 apiserver 的所有信息,如 apiserver 地址、CA 证书和自身使用的证书; source /opt/k8s/bin/environment.sh kubectl config set-cluster kubernetes --certificate-authority=/etc/kubernetes/cert/ca.pem --embed-certs=true --server=${KUBE_APISERVER} --kubeconfig=kube-scheduler.kubeconfig kubectl config set-credentials system:kube-scheduler --client-certificate=kube-scheduler.pem --client-key=kube-scheduler-key.pem --embed-certs=true --kubeconfig=kube-scheduler.kubeconfig kubectl config set-context system:kube-scheduler --cluster=kubernetes --user=system:kube-scheduler --kubeconfig=kube-scheduler.kubeconfig kubectl config use-context system:kube-scheduler --kubeconfig=kube-scheduler.kubeconfig 上一步创建的证书、私钥以及 kube-apiserver 地址被写入到 kubeconfig 文件中; 分发 kubeconfig 到所有 master 节点: source /opt/k8s/bin/environment.sh for node_ip in ${NODE_IPS[@]} do echo ">>> ${node_ip}" scp 
kube-scheduler.kubeconfig [email protected]${node_ip}:/etc/kubernetes/ done 创建和分发 kube-scheduler systemd unit 文件 cat > kube-scheduler.service <<EOF [Unit] Description=Kubernetes Scheduler Documentation=https://github.com/GoogleCloudPlatform/kubernetes [Service] ExecStart=/opt/k8s/bin/kube-scheduler \ --address=127.0.0.1 \ --kubeconfig=/etc/kubernetes/kube-scheduler.kubeconfig \ --leader-elect=true \ --alsologtostderr=true \ --logtostderr=false \ --log-dir=/var/log/kubernetes \ --v=2 Restart=on-failure RestartSec=5 User=k8s [Install] WantedBy=multi-user.target EOF --address:在 127.0.0.1:10251 端口接收 http /metrics 请求;kube-scheduler 目前还不支持接收 https 请求; --kubeconfig:指定 kubeconfig 文件路径,kube-scheduler 使用它连接和验证 kube-apiserver; --leader-elect=true:集群运行模式,启用选举功能;被选为 leader 的节点负责处理工作,其它节点为阻塞状态; User=k8s:使用 k8s 账户运行; 完整 unit 见 kube-scheduler.service。 分发 systemd unit 文件到所有 master 节点: source /opt/k8s/bin/environment.sh for node_ip in ${NODE_IPS[@]} do echo ">>> ${node_ip}" scp kube-scheduler.service [email protected]${node_ip}:/etc/systemd/system/ done 启动 kube-scheduler 服务 source /opt/k8s/bin/environment.sh for node_ip in ${NODE_IPS[@]} do echo ">>> ${node_ip}" ssh [email protected]${node_ip} "mkdir -p /var/log/kubernetes && chown -R k8s /var/log/kubernetes" ssh [email protected]${node_ip} "systemctl daemon-reload && systemctl enable kube-scheduler && systemctl restart kube-scheduler" done 必须先创建日志目录; 检查服务运行状态 source /opt/k8s/bin/environment.sh for node_ip in ${NODE_IPS[@]} do echo ">>> ${node_ip}" ssh [email protected]${node_ip} "systemctl status kube-scheduler|grep Active" done 确保状态为 active (running),否则查看日志,确认原因: journalctl -u kube-scheduler 查看输出的 metric 注意:以下命令在 kube-scheduler 节点上执行。 kube-scheduler 监听 10251 端口,接收 http 请求: sudo netstat -lnpt|grep kube-sche tcp 0 0 127.0.0.1:10251 0.0.0.0:* LISTEN 23783/kube-schedule curl -s http://127.0.0.1:10251/metrics |head # HELP apiserver_audit_event_total Counter of audit events generated and sent to the audit backend. # TYPE apiserver_audit_event_total counter apiserver_audit_event_total 0 # HELP go_gc_duration_seconds A summary of the GC invocation durations. 
# TYPE go_gc_duration_seconds summary
go_gc_duration_seconds{quantile="0"} 9.7715e-05
go_gc_duration_seconds{quantile="0.25"} 0.000107676
go_gc_duration_seconds{quantile="0.5"} 0.00017868
go_gc_duration_seconds{quantile="0.75"} 0.000262444
go_gc_duration_seconds{quantile="1"} 0.001205223
测试 kube-scheduler 集群的高可用
随便找一个或两个 master 节点,停掉 kube-scheduler 服务,看其它节点是否获取了 leader 权限(systemd 日志)。
查看当前的 leader
kubectl get endpoints kube-scheduler --namespace=kube-system -o yaml
apiVersion: v1
kind: Endpoints
metadata:
  annotations:
    control-plane.alpha.kubernetes.io/leader: '{"holderIdentity":"test3_61f34593-6cc8-11e8-8af7-5254002f288e","leaseDurationSeconds":15,"acquireTime":"2018-06-10T16:09:56Z","renewTime":"2018-06-10T16:20:54Z","leaderTransitions":1}'
  creationTimestamp: 2018-06-10T16:07:33Z
  name: kube-scheduler
  namespace: kube-system
  resourceVersion: "4645"
  selfLink: /api/v1/namespaces/kube-system/endpoints/kube-scheduler
  uid: 62382d98-6cc8-11e8-96fa-525400ba84c6
可见,当前的 leader 为 test3 节点。
07-0.部署 worker 节点
kubernetes worker 节点运行如下组件:
docker
kubelet
kube-proxy
安装和配置 flanneld
参考 05-部署flannel网络.md
安装依赖包
CentOS:
source /opt/k8s/bin/environment.sh
for node_ip in ${NODE_IPS[@]}
do
echo ">>> ${node_ip}"
ssh root@${node_ip} "yum install -y epel-release"
ssh root@${node_ip} "yum install -y conntrack ipvsadm ipset jq iptables curl sysstat libseccomp && /usr/sbin/modprobe ip_vs"
done
Ubuntu:
source /opt/k8s/bin/environment.sh
for node_ip in ${NODE_IPS[@]}
do
echo ">>> ${node_ip}"
ssh root@${node_ip} "apt-get install -y conntrack ipvsadm ipset jq iptables curl sysstat libseccomp && /usr/sbin/modprobe ip_vs"
done
07-1.部署 docker 组件
docker 是容器的运行环境,管理它的生命周期。kubelet 通过 Container Runtime Interface (CRI) 与 docker 进行交互。
安装依赖包
参考 07-0.部署worker节点.md
下载和分发 docker 二进制文件
到 https://download.docker.com/linux/static/stable/x86_64/ 页面下载最新发布包:
wget https://download.docker.com/linux/static/stable/x86_64/docker-18.03.1-ce.tgz
tar -xvf docker-18.03.1-ce.tgz
分发二进制文件到所有 worker 节点:
source /opt/k8s/bin/environment.sh
for node_ip in ${NODE_IPS[@]}
do
echo ">>> ${node_ip}"
scp docker/docker* root@${node_ip}:/opt/k8s/bin/
ssh root@${node_ip} "chmod +x /opt/k8s/bin/*"
done
创建和分发 systemd unit 文件
cat > docker.service <<"EOF"
[Unit]
Description=Docker Application Container Engine
Documentation=http://docs.docker.io

[Service]
Environment="PATH=/opt/k8s/bin:/bin:/sbin:/usr/bin:/usr/sbin"
EnvironmentFile=-/run/flannel/docker
ExecStart=/opt/k8s/bin/dockerd --log-level=error $DOCKER_NETWORK_OPTIONS
ExecReload=/bin/kill -s HUP $MAINPID
Restart=on-failure
RestartSec=5
LimitNOFILE=infinity
LimitNPROC=infinity
LimitCORE=infinity
Delegate=yes
KillMode=process

[Install]
WantedBy=multi-user.target
EOF
EOF 前后有双引号,这样 bash 不会替换文档中的变量,如 $DOCKER_NETWORK_OPTIONS;
dockerd 运行时会调用其它 docker 命令,如 docker-proxy,所以需要将 docker 命令所在的目录加到 PATH 环境变量中;
flanneld 启动时将网络配置写入 /run/flannel/docker 文件中,dockerd 启动前读取该文件中的环境变量 DOCKER_NETWORK_OPTIONS,然后设置 docker0 网桥网段;
如果指定了多个 EnvironmentFile 选项,则必须将 /run/flannel/docker 放在最后(确保 docker0 使用 flanneld 生成的 bip 参数);
docker 需要以 root 用户运行;
docker 从 1.13 版本开始,可能将 iptables FORWARD chain 的默认策略设置为 DROP,从而导致 ping 其它 Node 上的 Pod IP 失败,遇到这种情况时,需要手动设置策略为 ACCEPT:
sudo iptables -P FORWARD ACCEPT
并且把以下命令写入 /etc/rc.local 文件中,防止节点重启后 iptables FORWARD chain 的默认策略又还原为 DROP:
echo '/sbin/iptables -P FORWARD ACCEPT' >> /etc/rc.local
分发 systemd unit 文件到所有 worker 节点:
source /opt/k8s/bin/environment.sh
for node_ip in ${NODE_IPS[@]}
do
echo ">>> ${node_ip}"
scp docker.service [email
protected]${node_ip}:/etc/systemd/system/docker.service done 配置和分发 docker 配置文件 配置docker镜像加速 (需要重启 dockerd 生效): cat > docker-daemon.json <<EOF { "registry-mirrors": ["https://hub-mirror.c.163.com", "https://docker.mirrors.ustc.edu.cn"], "max-concurrent-downloads": 20 } EOF 分发 docker 配置文件到所有 work 节点: source /opt/k8s/bin/environment.sh for node_ip in ${NODE_IPS[@]} do echo ">>> ${node_ip}" ssh [email protected]${node_ip} "mkdir -p /etc/docker/" scp docker-daemon.json [email protected]${node_ip}:/etc/docker/daemon.json done 启动 docker 服务 source /opt/k8s/bin/environment.sh for node_ip in ${NODE_IPS[@]} do echo ">>> ${node_ip}" ssh [email protected]${node_ip} "systemctl stop firewalld && systemctl disable firewalld" ssh [email protected]${node_ip} "/usr/sbin/iptables -F && /usr/sbin/iptables -X && /usr/sbin/iptables -F -t nat && /usr/sbin/iptables -X -t nat" ssh [email protected]${node_ip} "/usr/sbin/iptables -P FORWARD ACCEPT" ssh [email protected]${node_ip} "systemctl daemon-reload && systemctl enable docker && systemctl restart docker" ssh [email protected]${node_ip} ‘for intf in /sys/devices/virtual/net/docker0/brif/*; do echo 1 > $intf/hairpin_mode; done‘ ssh [email protected]${node_ip} "sudo sysctl -p /etc/sysctl.d/kubernetes.conf" done 关闭 firewalld(centos7)/ufw(ubuntu16.04),否则可能会重复创建 iptables 规则; 清理旧的 iptables rules 和 chains 规则; 开启 docker0 网桥下虚拟网卡的 hairpin 模式; 检查服务运行状态 source /opt/k8s/bin/environment.sh for node_ip in ${NODE_IPS[@]} do echo ">>> ${node_ip}" ssh [email protected]${node_ip} "systemctl status docker|grep Active" done 确保状态为 active (running),否则查看日志,确认原因: journalctl -u docker 检查 docker0 网桥 source /opt/k8s/bin/environment.sh for node_ip in ${NODE_IPS[@]} do echo ">>> ${node_ip}" ssh [email protected]${node_ip} "/usr/sbin/ip addr show flannel.1 && /usr/sbin/ip addr show docker0" done 确认各 work 节点的 docker0 网桥和 flannel.1 接口的 IP 处于同一个网段中(如下 172.30.39.0 和 172.30.39.1): 3: flannel.1: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1450 qdisc noqueue state UNKNOWN group default link/ether ce:2f:d6:53:e5:f3 brd ff:ff:ff:ff:ff:ff inet 172.30.39.0/32 scope global flannel.1 valid_lft forever preferred_lft forever inet6 fe80::cc2f:d6ff:fe53:e5f3/64 scope link valid_lft forever preferred_lft forever 4: docker0: <NO-CARRIER,BROADCAST,MULTICAST,UP> mtu 1500 qdisc noqueue state DOWN group default link/ether 02:42:bf:65:16:5c brd ff:ff:ff:ff:ff:ff inet 172.30.39.1/24 brd 172.30.39.255 scope global docker0 valid_lft forever preferred_lft forever 07-2.部署 kubelet 组件 kublet 运行在每个 worker 节点上,接收 kube-apiserver 发送的请求,管理 Pod 容器,执行交互式命令,如 exec、run、logs 等。 kublet 启动时自动向 kube-apiserver 注册节点信息,内置的 cadvisor 统计和监控节点的资源使用情况。 为确保安全,本文档只开启接收 https 请求的安全端口,对请求进行认证和授权,拒绝未授权的访问(如 apiserver、heapster)。 下载和分发 kubelet 二进制文件 参考 06-0.部署master节点.md 安装依赖包 参考 07-0.部署worker节点.md 创建 kubelet bootstrap kubeconfig 文件 source /opt/k8s/bin/environment.sh for node_name in ${NODE_NAMES[@]} do echo ">>> ${node_name}" # 创建 token export BOOTSTRAP_TOKEN=$(kubeadm token create --description kubelet-bootstrap-token --groups system:bootstrappers:${node_name} --kubeconfig ~/.kube/config) # 设置集群参数 kubectl config set-cluster kubernetes --certificate-authority=/etc/kubernetes/cert/ca.pem --embed-certs=true --server=${KUBE_APISERVER} --kubeconfig=kubelet-bootstrap-${node_name}.kubeconfig # 设置客户端认证参数 kubectl config set-credentials kubelet-bootstrap --token=${BOOTSTRAP_TOKEN} --kubeconfig=kubelet-bootstrap-${node_name}.kubeconfig # 设置上下文参数 kubectl config set-context default --cluster=kubernetes --user=kubelet-bootstrap 
--kubeconfig=kubelet-bootstrap-${node_name}.kubeconfig # 设置默认上下文 kubectl config use-context default --kubeconfig=kubelet-bootstrap-${node_name}.kubeconfig done 证书中写入 Token 而非证书,证书后续由 controller-manager 创建。 查看 kubeadm 为各节点创建的 token: kubeadm token list --kubeconfig ~/.kube/config TOKEN TTL EXPIRES USAGES DESCRIPTION EXTRA GROUPS k0s2bj.7nvw1zi1nalyz4gz 23h 2018-06-14T15:14:31+08:00 authentication,signing kubelet-bootstrap-token system:bootstrappers:test1 mkus5s.vilnjk3kutei600l 23h 2018-06-14T15:14:32+08:00 authentication,signing kubelet-bootstrap-token system:bootstrappers:test3 zkiem5.0m4xhw0jc8r466nk 23h 2018-06-14T15:14:32+08:00 authentication,signing kubelet-bootstrap-token system:bootstrappers:test2 创建的 token 有效期为 1 天,超期后将不能再被使用,且会被 kube-controller-manager 的 tokencleaner 清理(如果启用该 controller 的话); kube-apiserver 接收 kubelet 的 bootstrap token 后,将请求的 user 设置为 system:bootstrap:,group 设置为 system:bootstrappers; 各 token 关联的 Secret: kubectl get secrets -n kube-system NAME TYPE DATA AGE bootstrap-token-k0s2bj bootstrap.kubernetes.io/token 7 1m bootstrap-token-mkus5s bootstrap.kubernetes.io/token 7 1m bootstrap-token-zkiem5 bootstrap.kubernetes.io/token 7 1m default-token-99st7 kubernetes.io/service-account-token 3 2d 分发 bootstrap kubeconfig 文件到所有 worker 节点 source /opt/k8s/bin/environment.sh for node_name in ${NODE_NAMES[@]} do echo ">>> ${node_name}" scp kubelet-bootstrap-${node_name}.kubeconfig [email protected]${node_name}:/etc/kubernetes/kubelet-bootstrap.kubeconfig done 创建和分发 kubelet 参数配置文件 从 v1.10 开始,kubelet 部分参数需在配置文件中配置,kubelet --help 会提示: DEPRECATED: This parameter should be set via the config file specified by the Kubelet‘s --config flag 创建 kubelet 参数配置模板文件: source /opt/k8s/bin/environment.sh cat > kubelet.config.json.template <<EOF { "kind": "KubeletConfiguration", "apiVersion": "kubelet.config.k8s.io/v1beta1", "authentication": { "x509": { "clientCAFile": "/etc/kubernetes/cert/ca.pem" }, "webhook": { "enabled": true, "cacheTTL": "2m0s" }, "anonymous": { "enabled": false } }, "authorization": { "mode": "Webhook", "webhook": { "cacheAuthorizedTTL": "5m0s", "cacheUnauthorizedTTL": "30s" } }, "address": "##NODE_IP##", "port": 10250, "readOnlyPort": 0, "cgroupDriver": "cgroupfs", "hairpinMode": "promiscuous-bridge", "serializeImagePulls": false, "featureGates": { "RotateKubeletClientCertificate": true, "RotateKubeletServerCertificate": true }, "clusterDomain": "${CLUSTER_DNS_DOMAIN}", "clusterDNS": ["${CLUSTER_DNS_SVC_IP}"] } EOF address:API 监听地址,不能为 127.0.0.1,否则 kube-apiserver、heapster 等不能调用 kubelet 的 API; readOnlyPort=0:关闭只读端口(默认 10255),等效为未指定; authentication.anonymous.enabled:设置为 false,不允许匿名?访问 10250 端口; authentication.x509.clientCAFile:指定签名客户端证书的 CA 证书,开启 HTTP 证书认证; authentication.webhook.enabled=true:开启 HTTPs bearer token 认证; 对于未通过 x509 证书和 webhook 认证的请求(kube-apiserver 或其他客户端),将被拒绝,提示 Unauthorized; authroization.mode=Webhook:kubelet 使用 SubjectAccessReview API 查询 kube-apiserver 某 user、group 是否具有操作资源的权限(RBAC); featureGates.RotateKubeletClientCertificate、featureGates.RotateKubeletServerCertificate:自动 rotate 证书,证书的有效期取决于 kube-controller-manager 的 --experimental-cluster-signing-duration 参数; 需要 root 账户运行; 为各节点创建和分发 kubelet 配置文件: source /opt/k8s/bin/environment.sh for node_ip in ${NODE_IPS[@]} do echo ">>> ${node_ip}" sed -e "s/##NODE_IP##/${node_ip}/" kubelet.config.json.template > kubelet.config-${node_ip}.json scp kubelet.config-${node_ip}.json [email protected]${node_ip}:/etc/kubernetes/kubelet.config.json done 替换后的 kubelet.config.json 文件: kubelet.config.json 创建和分发 kubelet systemd unit 文件 
Create and distribute the kubelet systemd unit file

Create the kubelet systemd unit template:

cat > kubelet.service.template <<EOF
[Unit]
Description=Kubernetes Kubelet
Documentation=https://github.com/GoogleCloudPlatform/kubernetes
After=docker.service
Requires=docker.service

[Service]
WorkingDirectory=/var/lib/kubelet
ExecStart=/opt/k8s/bin/kubelet \\
  --bootstrap-kubeconfig=/etc/kubernetes/kubelet-bootstrap.kubeconfig \\
  --cert-dir=/etc/kubernetes/cert \\
  --network-plugin=cni \\
  --cni-bin-dir=/opt/cni/bin \\
  --cni-conf-dir=/etc/cni/net.d \\
  --kubeconfig=/etc/kubernetes/kubelet.kubeconfig \\
  --config=/etc/kubernetes/kubelet.config.json \\
  --hostname-override=##NODE_NAME## \\
  --pod-infra-container-image=registry.access.redhat.com/rhel7/pod-infrastructure:latest \\
  --allow-privileged=true \\
  --alsologtostderr=true \\
  --logtostderr=false \\
  --log-dir=/var/log/kubernetes/ \\
  --v=2
Restart=on-failure
RestartSec=5

[Install]
WantedBy=multi-user.target
EOF

If --hostname-override is set, kube-proxy must set the same option, otherwise the Node may not be found;
--bootstrap-kubeconfig: points to the bootstrap kubeconfig file; kubelet uses the user name and token in it to send a TLS Bootstrapping request to kube-apiserver;
After K8S approves the kubelet's CSR, the certificate and private key are written to the --cert-dir directory and then recorded in the --kubeconfig file;

The rendered unit file: kubelet.service

Render and distribute the kubelet systemd unit file for each node:

source /opt/k8s/bin/environment.sh
for node_name in ${NODE_NAMES[@]}
do
  echo ">>> ${node_name}"
  sed -e "s/##NODE_NAME##/${node_name}/" kubelet.service.template > kubelet-${node_name}.service
  scp kubelet-${node_name}.service root@${node_name}:/etc/systemd/system/kubelet.service
done
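As a quick check (my addition, not in the original write-up), you can confirm that each node received a unit file rendered with its own hostname:

source /opt/k8s/bin/environment.sh
for node_name in ${NODE_NAMES[@]}
do
  echo ">>> ${node_name}"
  # the --hostname-override flag should carry this node's name
  ssh root@${node_name} "grep -- '--hostname-override' /etc/systemd/system/kubelet.service"
done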
Bootstrap Token Auth and granting permissions

When kubelet starts, it checks whether the file named by --kubeconfig exists; if it does not, kubelet uses the --bootstrap-kubeconfig file to send a certificate signing request (CSR) to kube-apiserver.

kube-apiserver authenticates the token carried in the request (the token created earlier with kubeadm). On success it sets the request's user to system:bootstrap:<token-id> and its group to system:bootstrappers. This process is called Bootstrap Token Auth.

By default this user and group have no permission to create CSRs, so kubelet fails to start with an error like:

sudo journalctl -u kubelet -a | grep -A 2 'certificatesigningrequests'
May 06 06:42:36 test1 kubelet[26986]: F0506 06:42:36.314378 26986 server.go:233] failed to run Kubelet: cannot create certificate signing request: certificatesigningrequests.certificates.k8s.io is forbidden: User "system:bootstrap:lemy40" cannot create certificatesigningrequests.certificates.k8s.io at the cluster scope
May 06 06:42:36 test1 systemd[1]: kubelet.service: Main process exited, code=exited, status=255/n/a
May 06 06:42:36 test1 systemd[1]: kubelet.service: Failed with result 'exit-code'.

The fix is to create a clusterrolebinding that binds the group system:bootstrappers to the clusterrole system:node-bootstrapper:

kubectl create clusterrolebinding kubelet-bootstrap --clusterrole=system:node-bootstrapper --group=system:bootstrappers

Grant kube-apiserver access to the kubelet API; without this, kubectl exec into a Pod is not possible:

cat > apiserver-to-kubelet.yaml <<EOF
apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRole
metadata:
  annotations:
    rbac.authorization.kubernetes.io/autoupdate: "true"
  labels:
    kubernetes.io/bootstrapping: rbac-defaults
  name: system:kubernetes-to-kubelet
rules:
  - apiGroups:
      - ""
    resources:
      - nodes/proxy
      - nodes/stats
      - nodes/log
      - nodes/spec
      - nodes/metrics
    verbs:
      - "*"
---
apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRoleBinding
metadata:
  name: system:kubernetes
  namespace: ""
roleRef:
  apiGroup: rbac.authorization.k8s.io
  kind: ClusterRole
  name: system:kubernetes-to-kubelet
subjects:
  - apiGroup: rbac.authorization.k8s.io
    kind: User
    name: kubernetes
EOF

Create the authorization objects:

[root@test1 ~]# kubectl create -f apiserver-to-kubelet.yaml
clusterrole.rbac.authorization.k8s.io/system:kubernetes-to-kubelet created
clusterrolebinding.rbac.authorization.k8s.io/system:kubernetes created

Exec into a container again to check:

[root@test1 ~]# kubectl exec -it http-test-dm2-6dbd76c7dd-cv9qf sh
/ # exit

Entering the container and inspecting resources works now. Note: this output is pasted in from an earlier run where the error occurred and was then fixed; at this point of the deployment no Pods exist yet, so this step cannot actually be performed here.

Start the kubelet service

source /opt/k8s/bin/environment.sh
for node_ip in ${NODE_IPS[@]}
do
  echo ">>> ${node_ip}"
  ssh root@${node_ip} "mkdir -p /var/lib/kubelet"
  ssh root@${node_ip} "/usr/sbin/swapoff -a"
  ssh root@${node_ip} "mkdir -p /var/log/kubernetes && chown -R k8s /var/log/kubernetes"
  ssh root@${node_ip} "systemctl daemon-reload && systemctl enable kubelet && systemctl restart kubelet"
done

The swap partition must be off, otherwise kubelet fails to start;
The working and log directories must be created before starting the service;
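A quick status check across the nodes (an optional sketch I am adding, assuming the passwordless root ssh set up earlier):

source /opt/k8s/bin/environment.sh
for node_ip in ${NODE_IPS[@]}
do
  echo ">>> ${node_ip}"
  # expect "active"; if not, print the last few log lines
  ssh root@${node_ip} "systemctl is-active kubelet || journalctl -u kubelet --no-pager | tail -n 20"
done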
Check the logs:

journalctl -u kubelet | tail
Jun 13 16:05:40 test2 kubelet[22343]: I0613 16:05:40.388242 22343 feature_gate.go:226] feature gates: &{{} map[RotateKubeletServerCertificate:true RotateKubeletClientCertificate:true]}
Jun 13 16:05:40 test2 kubelet[22343]: I0613 16:05:40.394342 22343 mount_linux.go:211] Detected OS with systemd
Jun 13 16:05:40 test2 kubelet[22343]: W0613 16:05:40.394494 22343 cni.go:171] Unable to update cni config: No networks found in /etc/cni/net.d
Jun 13 16:05:40 test2 kubelet[22343]: I0613 16:05:40.399508 22343 server.go:376] Version: v1.10.4
Jun 13 16:05:40 test2 kubelet[22343]: I0613 16:05:40.399583 22343 feature_gate.go:226] feature gates: &{{} map[RotateKubeletServerCertificate:true RotateKubeletClientCertificate:true]}
Jun 13 16:05:40 test2 kubelet[22343]: I0613 16:05:40.399736 22343 plugins.go:89] No cloud provider specified.
Jun 13 16:05:40 test2 kubelet[22343]: I0613 16:05:40.399752 22343 server.go:492] No cloud provider specified: "" from the config file: ""
Jun 13 16:05:40 test2 kubelet[22343]: I0613 16:05:40.399777 22343 bootstrap.go:58] Using bootstrap kubeconfig to generate TLS client cert, key and kubeconfig file
Jun 13 16:05:40 test2 kubelet[22343]: I0613 16:05:40.446068 22343 csr.go:105] csr for this node already exists, reusing
Jun 13 16:05:40 test2 kubelet[22343]: I0613 16:05:40.453761 22343 csr.go:113] csr for this node is still valid

After startup, kubelet uses --bootstrap-kubeconfig to send a CSR request to kube-apiserver. Once that CSR is approved, kube-controller-manager issues the kubelet's TLS client certificate and private key, and the --kubeconfig file is written.

Note: kube-controller-manager must be configured with the --cluster-signing-cert-file and --cluster-signing-key-file parameters, otherwise no certificate and private key are created for TLS Bootstrap.

Check the CSRs:

kubectl get csr
NAME                                                   AGE  REQUESTOR                CONDITION
node-csr-QzuuQiuUfcSdp3j5W4B2UOuvQ_n9aTNHAlrLzVFiqrk   43s  system:bootstrap:zkiem5  Pending
node-csr-oVbPmU-ikVknpynwu0Ckz_MvkAO_F1j0hmbcDa__sGA   27s  system:bootstrap:mkus5s  Pending
node-csr-u0E1-ugxgotO_9FiGXo8DkD6a7-ew8sX2qPE6KPS2IY   13m  system:bootstrap:k0s2bj  Pending

kubectl get nodes
No resources found.

The CSRs of the three worker nodes are all in Pending state;

Approve the kubelet CSR requests

CSR requests can be approved manually or automatically. The automatic way is recommended, because starting with v1.8 the certificates generated after a CSR is approved can also be rotated automatically.

Method 1: manually approve CSR requests

List the CSRs:

kubectl get csr
NAME                                                   AGE  REQUESTOR                CONDITION
node-csr-QzuuQiuUfcSdp3j5W4B2UOuvQ_n9aTNHAlrLzVFiqrk   43s  system:bootstrap:zkiem5  Pending
node-csr-oVbPmU-ikVknpynwu0Ckz_MvkAO_F1j0hmbcDa__sGA   27s  system:bootstrap:mkus5s  Pending
node-csr-u0E1-ugxgotO_9FiGXo8DkD6a7-ew8sX2qPE6KPS2IY   13m  system:bootstrap:k0s2bj  Pending

Approve a CSR:

kubectl certificate approve node-csr-QzuuQiuUfcSdp3j5W4B2UOuvQ_n9aTNHAlrLzVFiqrk
certificatesigningrequest.certificates.k8s.io "node-csr-QzuuQiuUfcSdp3j5W4B2UOuvQ_n9aTNHAlrLzVFiqrk" approved

Check the result:

kubectl describe csr node-csr-QzuuQiuUfcSdp3j5W4B2UOuvQ_n9aTNHAlrLzVFiqrk
Name:               node-csr-QzuuQiuUfcSdp3j5W4B2UOuvQ_n9aTNHAlrLzVFiqrk
Labels:             <none>
Annotations:        <none>
CreationTimestamp:  Wed, 13 Jun 2018 16:05:04 +0800
Requesting User:    system:bootstrap:zkiem5
Status:             Approved
Subject:
  Common Name:    system:node:test2
  Serial Number:
  Organization:   system:nodes
Events:  <none>

Requesting User: the user that submitted the CSR; kube-apiserver authenticates and authorizes it;
Subject: the certificate information being requested;
The certificate's CN is system:node:test2 and its Organization is system:nodes; kube-apiserver's Node authorization mode grants such certificates the corresponding permissions;
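If several CSRs are pending, a convenience one-liner such as the following (my addition, not part of the original text) approves them all in one go:

# approve every CSR currently in Pending state
kubectl get csr | grep Pending | awk '{print $1}' | xargs -r kubectl certificate approve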
Method 2: automatically approve CSR requests

Create three ClusterRoleBindings, used respectively to auto-approve client CSRs, client certificate renewals, and server certificate renewals:

cat > csr-crb.yaml <<EOF
# Approve all CSRs for the group "system:bootstrappers"
kind: ClusterRoleBinding
apiVersion: rbac.authorization.k8s.io/v1
metadata:
  name: auto-approve-csrs-for-group
subjects:
- kind: Group
  name: system:bootstrappers
  apiGroup: rbac.authorization.k8s.io
roleRef:
  kind: ClusterRole
  name: system:certificates.k8s.io:certificatesigningrequests:nodeclient
  apiGroup: rbac.authorization.k8s.io
---
# To let a node of the group "system:nodes" renew its own credentials
kind: ClusterRoleBinding
apiVersion: rbac.authorization.k8s.io/v1
metadata:
  name: node-client-cert-renewal
subjects:
- kind: Group
  name: system:nodes
  apiGroup: rbac.authorization.k8s.io
roleRef:
  kind: ClusterRole
  name: system:certificates.k8s.io:certificatesigningrequests:selfnodeclient
  apiGroup: rbac.authorization.k8s.io
---
# A ClusterRole which instructs the CSR approver to approve a node requesting a
# serving cert matching its client cert.
kind: ClusterRole
apiVersion: rbac.authorization.k8s.io/v1
metadata:
  name: approve-node-server-renewal-csr
rules:
- apiGroups: ["certificates.k8s.io"]
  resources: ["certificatesigningrequests/selfnodeserver"]
  verbs: ["create"]
---
# To let a node of the group "system:nodes" renew its own server credentials
kind: ClusterRoleBinding
apiVersion: rbac.authorization.k8s.io/v1
metadata:
  name: node-server-cert-renewal
subjects:
- kind: Group
  name: system:nodes
  apiGroup: rbac.authorization.k8s.io
roleRef:
  kind: ClusterRole
  name: approve-node-server-renewal-csr
  apiGroup: rbac.authorization.k8s.io
EOF

auto-approve-csrs-for-group: automatically approves a node's first CSR; note that the first CSR is requested with the group system:bootstrappers;
node-client-cert-renewal: automatically approves renewals of a node's expiring client certificate; the automatically generated certificates belong to the group system:nodes;
node-server-cert-renewal: automatically approves renewals of a node's expiring server certificate; the automatically generated certificates belong to the group system:nodes;

Apply the configuration:

kubectl apply -f csr-crb.yaml

Check the kubelet status

After waiting a while (1 to 10 minutes), the CSRs of all three nodes are approved automatically:

kubectl get csr
NAME                                                   AGE  REQUESTOR                CONDITION
csr-98h25                                              6m   system:node:test2        Approved,Issued
csr-lb5c9                                              7m   system:node:test3        Approved,Issued
csr-m2hn4                                              14m  system:node:test1        Approved,Issued
node-csr-7q7i0q4MF_K2TSEJj16At4CJFLlJkHIqei6nMIAaJCU   28m  system:bootstrap:k0s2bj  Approved,Issued
node-csr-ND77wk2P8k2lHBtgBaObiyYw0uz1Um7g2pRvveMF-c4   35m  system:bootstrap:mkus5s  Approved,Issued
node-csr-Nysmrw55nnM48NKwEJuiuCGmZoxouK4N8jiEHBtLQso   6m   system:bootstrap:zkiem5  Approved,Issued
node-csr-QzuuQiuUfcSdp3j5W4B2UOuvQ_n9aTNHAlrLzVFiqrk   1h   system:bootstrap:zkiem5  Approved,Issued
node-csr-oVbPmU-ikVknpynwu0Ckz_MvkAO_F1j0hmbcDa__sGA   1h   system:bootstrap:mkus5s  Approved,Issued
node-csr-u0E1-ugxgotO_9FiGXo8DkD6a7-ew8sX2qPE6KPS2IY   1h   system:bootstrap:k0s2bj  Approved,Issued

All nodes are Ready:

kubectl get nodes
NAME    STATUS  ROLES   AGE  VERSION
test1   Ready   <none>  18m  v1.10.4
test2   Ready   <none>  10m  v1.10.4
test3   Ready   <none>  11m  v1.10.4

kube-controller-manager has generated a kubeconfig file and a key pair for each node:

ls -l /etc/kubernetes/kubelet.kubeconfig
-rw------- 1 root root 2293 Jun 13 17:07 /etc/kubernetes/kubelet.kubeconfig

ls -l /etc/kubernetes/cert/ | grep kubelet
-rw-r--r-- 1 root root 1046 Jun 13 17:07 kubelet-client.crt
-rw------- 1 root root  227 Jun 13 17:07 kubelet-client.key
-rw------- 1 root root 1334 Jun 13 17:07 kubelet-server-2018-06-13-17-07-45.pem
lrwxrwxrwx 1 root root   58 Jun 13 17:07 kubelet-server-current.pem -> /etc/kubernetes/cert/kubelet-server-2018-06-13-17-07-45.pem

The kubelet-server certificate is rotated periodically;

Check the ports kubelet listens on (from my own testing: on v1.13.0 port 4194 no longer shows up, and forcing it on keeps kubelet from starting).

The API endpoints provided by kubelet

After startup, kubelet listens on several ports to serve requests from kube-apiserver and other components:

sudo netstat -lnpt | grep kubelet
tcp  0  0 192.168.0.92:4194   0.0.0.0:*  LISTEN  2490/kubelet
tcp  0  0 127.0.0.1:10248     0.0.0.0:*  LISTEN  2490/kubelet
tcp  0  0 192.168.0.92:10250  0.0.0.0:*  LISTEN  2490/kubelet

4194: cadvisor HTTP service;
10248: healthz HTTP service;
10250: HTTPS API service; note that the read-only port 10255 is not enabled;

For example, when running kubectl exec -it nginx-ds-5rmws -- sh, kube-apiserver sends a request like the following to kubelet:

POST /exec/default/nginx-ds-5rmws/my-nginx?command=sh&input=1&output=1&tty=1

kubelet serves the following HTTPS endpoints on port 10250:
/pods, /runningpods
/metrics, /metrics/cadvisor, /metrics/probes
/spec
/stats, /stats/container
/logs
management endpoints such as /run/, /exec/, /attach/, /portForward/, /containerLogs/

For details see: https://github.com/kubernetes/kubernetes/blob/master/pkg/kubelet/server/server.go#L434:3

Because anonymous authentication is disabled and webhook authorization is enabled, every request to the HTTPS API on port 10250 must be authenticated and authorized.

The predefined ClusterRole system:kubelet-api-admin grants access to all kubelet APIs:

kubectl describe clusterrole system:kubelet-api-admin
Name:         system:kubelet-api-admin
Labels:       kubernetes.io/bootstrapping=rbac-defaults
Annotations:  rbac.authorization.kubernetes.io/autoupdate=true
PolicyRule:
  Resources      Non-Resource URLs  Resource Names  Verbs
  ---------      -----------------  --------------  -----
  nodes          []                 []              [get list watch proxy]
  nodes/log      []                 []              [*]
  nodes/metrics  []                 []              [*]
  nodes/proxy    []                 []              [*]
  nodes/spec     []                 []              [*]
  nodes/stats    []                 []              [*]

kubelet API authentication and authorization

kubelet is configured with the following authentication parameters:
authentication.anonymous.enabled: set to false, anonymous access to port 10250 is not allowed;
authentication.x509.clientCAFile: the CA certificate that signs client certificates, enabling HTTPS client-certificate authentication;
authentication.webhook.enabled=true: enables HTTPS bearer-token authentication;

and the following authorization parameter:
authorization.mode=Webhook: enables RBAC authorization;

When kubelet receives a request, it verifies the client certificate against clientCAFile or checks whether the bearer token is valid. If neither check passes, the request is rejected with Unauthorized:

curl -s --cacert /etc/kubernetes/cert/ca.pem https://192.168.0.92:10250/metrics
Unauthorized

curl -s --cacert /etc/kubernetes/cert/ca.pem -H "Authorization: Bearer 123456" https://192.168.0.92:10250/metrics
Unauthorized

After authentication succeeds, kubelet sends a SubjectAccessReview request to kube-apiserver to check whether the user/group behind the certificate or token has RBAC permission for the requested resource.

Certificate authentication and authorization:

A certificate with insufficient permissions:

curl -s --cacert /etc/kubernetes/cert/ca.pem --cert /etc/kubernetes/cert/kube-controller-manager.pem --key /etc/kubernetes/cert/kube-controller-manager-key.pem https://192.168.0.92:10250/metrics
Forbidden (user=system:kube-controller-manager, verb=get, resource=nodes, subresource=metrics)

Using the admin certificate with the highest privileges, created when deploying the kubectl command-line tool:

curl -s --cacert /etc/kubernetes/cert/ca.pem --cert ./admin.pem --key ./admin-key.pem https://192.168.0.92:10250/metrics|head
# HELP apiserver_client_certificate_expiration_seconds Distribution of the remaining lifetime on the certificate used to authenticate a request.
# TYPE apiserver_client_certificate_expiration_seconds histogram
apiserver_client_certificate_expiration_seconds_bucket{le="0"} 0
apiserver_client_certificate_expiration_seconds_bucket{le="21600"} 0
apiserver_client_certificate_expiration_seconds_bucket{le="43200"} 0
apiserver_client_certificate_expiration_seconds_bucket{le="86400"} 0
apiserver_client_certificate_expiration_seconds_bucket{le="172800"} 0
apiserver_client_certificate_expiration_seconds_bucket{le="345600"} 0
apiserver_client_certificate_expiration_seconds_bucket{le="604800"} 0
apiserver_client_certificate_expiration_seconds_bucket{le="2.592e+06"} 0

The values of --cacert, --cert and --key must be file paths; the ./ in ./admin.pem above cannot be omitted, otherwise the request returns 401 Unauthorized;

Bearer-token authentication and authorization:

Create a ServiceAccount and bind it to the ClusterRole system:kubelet-api-admin so that it can call the kubelet API:

kubectl create sa kubelet-api-test
kubectl create clusterrolebinding kubelet-api-test --clusterrole=system:kubelet-api-admin --serviceaccount=default:kubelet-api-test
SECRET=$(kubectl get secrets | grep kubelet-api-test | awk '{print $1}')
TOKEN=$(kubectl describe secret ${SECRET} | grep -E '^token' | awk '{print $2}')
echo ${TOKEN}

curl -s --cacert /etc/kubernetes/cert/ca.pem -H "Authorization: Bearer ${TOKEN}" https://192.168.0.92:10250/metrics|head
# HELP apiserver_client_certificate_expiration_seconds Distribution of the remaining lifetime on the certificate used to authenticate a request.
# TYPE apiserver_client_certificate_expiration_seconds histogram
apiserver_client_certificate_expiration_seconds_bucket{le="0"} 0
apiserver_client_certificate_expiration_seconds_bucket{le="21600"} 0
apiserver_client_certificate_expiration_seconds_bucket{le="43200"} 0
apiserver_client_certificate_expiration_seconds_bucket{le="86400"} 0
apiserver_client_certificate_expiration_seconds_bucket{le="172800"} 0
apiserver_client_certificate_expiration_seconds_bucket{le="345600"} 0
apiserver_client_certificate_expiration_seconds_bucket{le="604800"} 0
apiserver_client_certificate_expiration_seconds_bucket{le="2.592e+06"} 0

cadvisor and metrics

cadvisor collects the resource usage (CPU, memory, disk, network) of the containers on its node and exposes it both on its own HTTP web page (port 4194) and on port 10250 in Prometheus metrics format.

Opening http://192.168.0.91:4194/containers/ in a browser shows the cadvisor monitoring page.

Opening https://172.27.129.80:10250/metrics and https://172.27.129.80:10250/metrics/cadvisor returns the kubelet and cadvisor metrics respectively.

Note:
kubelet.config.json sets authentication.anonymous.enabled to false, so anonymous access to the HTTPS service on port 10250 is not allowed;
refer to A.浏览器访问kube-apiserver安全端口.md to create and import the relevant certificates, then access port 10250 as above;

Get the kubelet configuration

Fetch each node's configuration from kube-apiserver, using the admin certificate with the highest privileges created when deploying kubectl:

source /opt/k8s/bin/environment.sh
curl -sSL --cacert /etc/kubernetes/cert/ca.pem --cert ./admin.pem --key ./admin-key.pem ${KUBE_APISERVER}/api/v1/nodes/test1/proxy/configz | jq '.kubeletconfig|.kind="KubeletConfiguration"|.apiVersion="kubelet.config.k8s.io/v1beta1"'
{
  "syncFrequency": "1m0s",
  "fileCheckFrequency": "20s",
  "httpCheckFrequency": "20s",
  "address": "172.27.129.80",
  "port": 10250,
  "readOnlyPort": 10255,
  "authentication": {
    "x509": {},
    "webhook": {
      "enabled": false,
      "cacheTTL": "2m0s"
    },
    "anonymous": {
      "enabled": true
    }
  },
  "authorization": {
    "mode": "AlwaysAllow",
    "webhook": {
      "cacheAuthorizedTTL": "5m0s",
      "cacheUnauthorizedTTL": "30s"
    }
  },
  "registryPullQPS": 5,
  "registryBurst": 10,
  "eventRecordQPS": 5,
  "eventBurst": 10,
  "enableDebuggingHandlers": true,
  "healthzPort": 10248,
  "healthzBindAddress": "127.0.0.1",
  "oomScoreAdj": -999,
  "clusterDomain": "cluster.local.",
  "clusterDNS": [
    "10.254.0.2"
  ],
  "streamingConnectionIdleTimeout": "4h0m0s",
  "nodeStatusUpdateFrequency": "10s",
  "imageMinimumGCAge": "2m0s",
  "imageGCHighThresholdPercent": 85,
  "imageGCLowThresholdPercent": 80,
  "volumeStatsAggPeriod": "1m0s",
  "cgroupsPerQOS": true,
  "cgroupDriver": "cgroupfs",
  "cpuManagerPolicy": "none",
  "cpuManagerReconcilePeriod": "10s",
  "runtimeRequestTimeout": "2m0s",
  "hairpinMode": "promiscuous-bridge",
  "maxPods": 110,
  "podPidsLimit": -1,
  "resolvConf": "/etc/resolv.conf",
  "cpuCFSQuota": true,
  "maxOpenFiles": 1000000,
  "contentType": "application/vnd.kubernetes.protobuf",
  "kubeAPIQPS": 5,
  "kubeAPIBurst": 10,
  "serializeImagePulls": false,
  "evictionHard": {
    "imagefs.available": "15%",
    "memory.available": "100Mi",
    "nodefs.available": "10%",
    "nodefs.inodesFree": "5%"
  },
  "evictionPressureTransitionPeriod": "5m0s",
  "enableControllerAttachDetach": true,
  "makeIPTablesUtilChains": true,
  "iptablesMasqueradeBit": 14,
  "iptablesDropBit": 15,
  "featureGates": {
    "RotateKubeletClientCertificate": true,
    "RotateKubeletServerCertificate": true
  },
  "failSwapOn": true,
  "containerLogMaxSize": "10Mi",
  "containerLogMaxFiles": 5,
  "enforceNodeAllocatable": [
    "pods"
  ],
  "kind": "KubeletConfiguration",
  "apiVersion": "kubelet.config.k8s.io/v1beta1"
}

07-3. Deploying the kube-proxy component

kube-proxy runs on all worker nodes; it watches the apiserver for changes to Services and Endpoints and creates routing rules to load-balance service traffic.

This section walks through deploying kube-proxy. (Note: the introduction mentions ipvs mode, but the configuration template below sets mode: "iptables".)

Download and distribute the kube-proxy binary: see 06-0.部署master节点.md.

Install dependencies: each node needs the ipvsadm and ipset commands installed and the ip_vs kernel modules loaded; see 07-0.部署worker节点.md, or the example below.
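A minimal sketch (my addition) for loading the IPVS-related modules on every node; the module names assume a standard CentOS 7 kernel (on newer kernels nf_conntrack replaces nf_conntrack_ipv4):

source /opt/k8s/bin/environment.sh
for node_ip in ${NODE_IPS[@]}
do
  echo ">>> ${node_ip}"
  # load the ipvs schedulers and the conntrack module kube-proxy relies on, then confirm
  ssh root@${node_ip} "for m in ip_vs ip_vs_rr ip_vs_wrr ip_vs_sh nf_conntrack_ipv4; do modprobe \$m; done; lsmod | grep ip_vs"
done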
Create the kube-proxy certificate

Create the certificate signing request:

cat > kube-proxy-csr.json <<EOF
{
  "CN": "system:kube-proxy",
  "key": {
    "algo": "rsa",
    "size": 2048
  },
  "names": [
    {
      "C": "CN",
      "ST": "BeiJing",
      "L": "BeiJing",
      "O": "k8s",
      "OU": "4Paradigm"
    }
  ]
}
EOF

CN: sets the certificate's User to system:kube-proxy;
The predefined ClusterRoleBinding system:node-proxier binds the user system:kube-proxy to the ClusterRole system:node-proxier, which grants permission to call the proxy-related kube-apiserver APIs;
The certificate is only used by kube-proxy as a client certificate, so the hosts field is empty;

Generate the certificate and private key:

cfssl gencert -ca=/etc/kubernetes/cert/ca.pem -ca-key=/etc/kubernetes/cert/ca-key.pem -config=/etc/kubernetes/cert/ca-config.json -profile=kubernetes kube-proxy-csr.json | cfssljson -bare kube-proxy

Create and distribute the kubeconfig file

source /opt/k8s/bin/environment.sh
kubectl config set-cluster kubernetes --certificate-authority=/etc/kubernetes/cert/ca.pem --embed-certs=true --server=${KUBE_APISERVER} --kubeconfig=kube-proxy.kubeconfig
kubectl config set-credentials kube-proxy --client-certificate=kube-proxy.pem --client-key=kube-proxy-key.pem --embed-certs=true --kubeconfig=kube-proxy.kubeconfig
kubectl config set-context default --cluster=kubernetes --user=kube-proxy --kubeconfig=kube-proxy.kubeconfig
kubectl config use-context default --kubeconfig=kube-proxy.kubeconfig

--embed-certs=true: embeds the contents of ca.pem and kube-proxy.pem into the generated kube-proxy.kubeconfig (without it, only the certificate file paths are written);

Distribute the kubeconfig file:

source /opt/k8s/bin/environment.sh
for node_name in ${NODE_NAMES[@]}
do
  echo ">>> ${node_name}"
  scp kube-proxy.kubeconfig root@${node_name}:/etc/kubernetes/
done

Create the kube-proxy configuration file

Starting with v1.10, some kube-proxy parameters can be set in a configuration file. The file can be generated with the --write-config-to option, or derived from the kubeproxyconfig type definitions: https://github.com/kubernetes/kubernetes/blob/master/pkg/proxy/apis/kubeproxyconfig/types.go

Create the kube-proxy config file template:

cat > kube-proxy.config.yaml.template <<EOF
apiVersion: kubeproxy.config.k8s.io/v1alpha1
bindAddress: ##NODE_IP##
clientConnection:
  kubeconfig: /etc/kubernetes/kube-proxy.kubeconfig
clusterCIDR: ${CLUSTER_CIDR}
healthzBindAddress: ##NODE_IP##:10256
hostnameOverride: ##NODE_NAME##
kind: KubeProxyConfiguration
metricsBindAddress: ##NODE_IP##:10249
mode: "iptables"
EOF

bindAddress: the listen address;
clientConnection.kubeconfig: the kubeconfig used to connect to the apiserver;
clusterCIDR: kube-proxy uses it to distinguish cluster-internal from external traffic; only when --cluster-cidr or --masquerade-all is set will kube-proxy SNAT requests to Service IPs;
hostnameOverride: must match the value used by kubelet, otherwise kube-proxy cannot find its Node after startup and will not create any iptables rules;
mode: iptables mode is used here;

Render and distribute the kube-proxy configuration file for each node:

source /opt/k8s/bin/environment.sh
for (( i=0; i < 3; i++ ))
do
  echo ">>> ${NODE_NAMES[i]}"
  sed -e "s/##NODE_NAME##/${NODE_NAMES[i]}/" -e "s/##NODE_IP##/${NODE_IPS[i]}/" kube-proxy.config.yaml.template > kube-proxy-${NODE_NAMES[i]}.config.yaml
  scp kube-proxy-${NODE_NAMES[i]}.config.yaml root@${NODE_NAMES[i]}:/etc/kubernetes/kube-proxy.config.yaml
done

The rendered kube-proxy.config.yaml file: kube-proxy.config.yaml

Create and distribute the kube-proxy systemd unit file

source /opt/k8s/bin/environment.sh
cat > kube-proxy.service <<EOF
[Unit]
Description=Kubernetes Kube-Proxy Server
Documentation=https://github.com/GoogleCloudPlatform/kubernetes
After=network.target

[Service]
WorkingDirectory=/var/lib/kube-proxy
ExecStart=/opt/k8s/bin/kube-proxy \\
  --config=/etc/kubernetes/kube-proxy.config.yaml \\
  --alsologtostderr=true \\
  --logtostderr=false \\
  --log-dir=/var/log/kubernetes \\
  --v=2
Restart=on-failure
RestartSec=5
LimitNOFILE=65536

[Install]
WantedBy=multi-user.target
EOF

The rendered unit file: kube-proxy.service
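Before distributing the unit file, you can confirm that the per-node configuration files distributed above were rendered correctly (an optional check I am adding, reusing the passwordless root ssh):

source /opt/k8s/bin/environment.sh
for node_ip in ${NODE_IPS[@]}
do
  echo ">>> ${node_ip}"
  # bindAddress should carry this node's IP and hostnameOverride its name
  ssh root@${node_ip} "grep -E 'bindAddress|hostnameOverride' /etc/kubernetes/kube-proxy.config.yaml"
done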
Distribute the kube-proxy systemd unit file:

source /opt/k8s/bin/environment.sh
for node_name in ${NODE_NAMES[@]}
do
  echo ">>> ${node_name}"
  scp kube-proxy.service root@${node_name}:/etc/systemd/system/
done

Start the kube-proxy service

source /opt/k8s/bin/environment.sh
for node_ip in ${NODE_IPS[@]}
do
  echo ">>> ${node_ip}"
  ssh root@${node_ip} "mkdir -p /var/lib/kube-proxy"
  ssh root@${node_ip} "mkdir -p /var/log/kubernetes && chown -R k8s /var/log/kubernetes"
  ssh root@${node_ip} "systemctl daemon-reload && systemctl enable kube-proxy && systemctl restart kube-proxy"
done

The working and log directories must be created before starting the service;
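To verify the result (an optional check, not part of the original text), confirm kube-proxy is active on every node and listening on the metrics port (10249) and healthz port (10256) configured above:

source /opt/k8s/bin/environment.sh
for node_ip in ${NODE_IPS[@]}
do
  echo ">>> ${node_ip}"
  ssh root@${node_ip} "systemctl is-active kube-proxy"
  # 10249 comes from metricsBindAddress, 10256 from healthzBindAddress
  ssh root@${node_ip} "netstat -lnpt | grep kube-proxy"
done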
Source: https://www.cnblogs.com/effortsing/p/10360489.html