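1. Two kinds of probes
Readiness probe
Checks whether the container is ready. The kubelet considers a pod ready only when the containers in it are ready.
The readiness probe controls which pods may serve as backends of a Service (svc). If a pod is not ready, it is removed from the Service's load balancer.
Liveness probe
Checks whether the container is alive. If the application in the container runs into trouble, the liveness probe reports the container as unhealthy and the kubelet restarts that container.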
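2. Three ways to use a probe
The official documentation describes three, listed below:
Command execution (exec)
HTTP request (httpGet)
TCP socket connection (tcpSocket)
Personally I prefer the third one, the TCP socket.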
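For comparison, the other two methods can be sketched as follows. This is a minimal illustration rather than a tested manifest: the pod name probe-methods is hypothetical, and both the HTTP check of path / on port 80 and the exec check that reads /etc/nginx/nginx.conf are assumptions chosen to suit the stock nginx image.

```yaml
apiVersion: v1
kind: Pod
metadata:
  name: probe-methods        # hypothetical name, for illustration only
spec:
  containers:
  - name: nginx
    image: nginx
    ports:
    - containerPort: 80
    readinessProbe:
      httpGet:               # HTTP request: a 2xx or 3xx response counts as success
        path: /              # assumed path; nginx serves its default page here
        port: 80
      initialDelaySeconds: 5
      periodSeconds: 10
    livenessProbe:
      exec:                  # command execution: exit code 0 counts as success
        command:
        - cat
        - /etc/nginx/nginx.conf   # assumed file; expected in the stock nginx image
      initialDelaySeconds: 5
      periodSeconds: 10
```

A tcpSocket probe only proves that the port accepts connections. httpGet and exec can check more of the application's behavior, at the cost of a little more configuration.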
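3. Learning and testing the TCP socket method
TCP socket method
This method is fairly easy to understand.
For example, start an nginx container; the nginx service listens on port 80.
Configure a TCP socket probe that connects to port 80 over TCP at a fixed interval. If the connection succeeds, the container is reported as healthy (liveness) or ready (readiness). If the connection fails, the container is reported as unhealthy or not ready: a failed liveness probe makes the kubelet restart the container, while a failed readiness probe takes the pod out of the Service backends.
Reverse example:
Basic idea: point the TCP socket probe at port 8080, where nothing is listening. The connection is bound to fail, so the pod restarts over and over.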
[k8s_tanzhen]# cat tcp_ness
apiVersion: v1
kind: Pod
metadata:
  name: httpd
  labels:
    app: httpd
spec:
  containers:
  - name: httpd
    image: nginx
    ports:
    - containerPort: 80
    readinessProbe:
      tcpSocket:
        port: 8080
      initialDelaySeconds: 45
      periodSeconds: 20
    livenessProbe:
      tcpSocket:
        port: 8080
      initialDelaySeconds: 45
      periodSeconds: 20
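This starts an nginx pod whose container serves on port 80.
The probes are configured to connect to port 8080. The first check runs 45s after the container starts, and after that a check runs every 20s.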
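The probe in this example sets only two timing fields; the remaining ones fall back to Kubernetes defaults. The fragment below writes those defaults out explicitly as a sketch (timeoutSeconds: 1, successThreshold: 1 and failureThreshold: 3 are the documented defaults, added here only for illustration). With these values the liveness probe fails at roughly 45s, 65s and 85s, and the kubelet restarts the container after the third consecutive failure.

```yaml
# Fragment: belongs under a container entry in the pod spec
livenessProbe:
  tcpSocket:
    port: 8080
  initialDelaySeconds: 45   # wait 45s after the container starts before the first check
  periodSeconds: 20         # then check every 20s
  timeoutSeconds: 1         # default: each check must finish within 1s
  successThreshold: 1       # default: one success marks the probe as passing
  failureThreshold: 3       # default: three consecutive failures trigger a restart
```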
Test result: the container keeps restarting.
[k8s_tanzhen]# kubectl get pod -o wide
NAME    READY   STATUS             RESTARTS   AGE   IP            NODE
httpd   0/1     CrashLoopBackOff   7          18m   172.30.35.3   k8s-master3
Errors shown by describe:
  Warning  Unhealthy  6m (x19 over 16m)  kubelet, k8s-master3  Liveness probe failed: dial tcp 172.30.35.3:8080: connect: connection refused
  Warning  Unhealthy  2m (x15 over 16m)  kubelet, k8s-master3  Readiness probe failed: dial tcp 172.30.35.3:8080: connect: connection refused
The probes open a TCP connection to the container IP on port 8080 and fail, so the container keeps restarting.
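Normal configuration example
A normal configuration connects to port 80, the port that actually serves traffic.
Basic idea: in theory, an application that runs for a long time eventually transitions to a broken state and cannot recover unless it is restarted. Kubernetes provides liveness probes to detect and remedy this situation. That is the fundamental reason for configuring probes: as a safeguard.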
[k8s_tanzhen]# cat tcp_ness
apiVersion: v1
kind: Pod
metadata:
  name: httpd
  labels:
    app: httpd
spec:
  containers:
  - name: httpd
    image: nginx
    ports:
    - containerPort: 80
    readinessProbe:
      tcpSocket:
        port: 80
      initialDelaySeconds: 45
      periodSeconds: 20
    livenessProbe:
      tcpSocket:
        port: 80
      initialDelaySeconds: 45
      periodSeconds: 20
[k8s_tanzhen]# kubectl get pod -o wide
NAME    READY   STATUS    RESTARTS   AGE   IP            NODE
httpd   1/1     Running   0          2m    172.30.35.3   k8s-master3
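Since the readiness probe decides whether this pod receives Service traffic, a Service makes the effect visible. The manifest below is a minimal sketch: the Service name httpd-svc is hypothetical, and the selector assumes the app: httpd label used by the pod above.

```yaml
apiVersion: v1
kind: Service
metadata:
  name: httpd-svc      # hypothetical name
spec:
  selector:
    app: httpd         # matches the label on the httpd pod
  ports:
  - port: 80           # port exposed by the Service
    targetPort: 80     # containerPort of the nginx container
```

While the pod shows READY 1/1, kubectl get endpoints httpd-svc should list the pod IP on port 80. If the readiness probe starts failing, that address should disappear from the list.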
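Simulated failure test with the normal configuration
Basic idea: start an nginx container, then run a command that kills the nginx process, with the probes set to connect to TCP port 80. Once the nginx process is gone, the TCP connection fails, the probes report the container as unhealthy and not ready, and the kubelet restarts the container.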
[k8s_tanzhen]# cat tcp_ness
apiVersion: v1
kind: Pod
metadata:
  name: httpd
  labels:
    app: httpd
spec:
  containers:
  - name: httpd
    image: nginx
    args:
    - /bin/sh
    - -c
    - sleep 60;nginx -s quit
    ports:
    - containerPort: 80
    readinessProbe:
      tcpSocket:
        port: 80
      initialDelaySeconds: 20
      periodSeconds: 10
    livenessProbe:
      tcpSocket:
        port: 80
      initialDelaySeconds: 20
      periodSeconds: 10
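Notes on the configuration:
After the container starts, it sleeps for 60s and then runs nginx -s quit to stop nginx. Note that args replaces the image's default command, so nginx is not started beforehand and port 80 is closed from the beginning.
The readiness and liveness checks begin 20s after the container starts.
About 35s after the container starts,
the probes find that they cannot connect to port 80 and raise the warnings below:
  Warning  Unhealthy  8s (x3 over 28s)  kubelet, k8s-master3  Liveness probe failed: dial tcp 172.30.35.3:80: connect: connection refused
  Warning  Unhealthy  7s (x3 over 27s)  kubelet, k8s-master3  Readiness probe failed: dial tcp 172.30.35.3:80: connect: connection refused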
Full event log of the restart
Events:
  Type     Reason     Age               From                  Message
  ----     ------     ----              ----                  -------
  Normal   Scheduled  2m                default-scheduler     Successfully assigned default/httpd to k8s-master3
  Normal   Pulled     1m (x2 over 2m)   kubelet, k8s-master3  Successfully pulled image "nginx"
  Normal   Created    1m (x2 over 2m)   kubelet, k8s-master3  Created container
  Normal   Started    1m (x2 over 2m)   kubelet, k8s-master3  Started container
  Warning  Unhealthy  16s (x6 over 1m)  kubelet, k8s-master3  Liveness probe failed: dial tcp 172.30.35.3:80: connect: connection refused
  Warning  Unhealthy  15s (x8 over 1m)  kubelet, k8s-master3  Readiness probe failed: dial tcp 172.30.35.3:80: connect: connection refused
  Normal   Pulling    5s (x3 over 2m)   kubelet, k8s-master3  pulling image "nginx"
  Normal   Killing    5s (x2 over 1m)   kubelet, k8s-master3  Killing container with id docker://httpd:Container failed liveness probe. Container will be killed and recreated.
As the events show, once the liveness probe fails the kubelet kills the container and restarts it automatically.
[k8s_tanzhen]# kubectl get pod -o wide
NAME    READY   STATUS    RESTARTS   AGE   IP            NODE
httpd   0/1     Running   4          5m    172.30.35.3   k8s-master3
The test achieved its purpose.
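To watch the probes react to a service that was healthy and then stopped, nginx has to be listening first. In the manifest of this test, args replaces the image's default command, so nginx never listens on port 80. A variant that serves traffic for a while and then stops could look like the sketch below. It is untested and rests on assumptions: that a plain nginx call starts the server in the background under the stock image's configuration, and that the trailing sleep 3600 (an arbitrary value) keeps the container alive so the restart comes from the liveness probe rather than from the container exiting.

```yaml
apiVersion: v1
kind: Pod
metadata:
  name: httpd
  labels:
    app: httpd
spec:
  containers:
  - name: httpd
    image: nginx
    command: ["/bin/sh", "-c"]
    args:
    # start nginx in the background, serve for 60s, stop it, then idle
    - nginx; sleep 60; nginx -s quit; sleep 3600
    ports:
    - containerPort: 80
    readinessProbe:
      tcpSocket:
        port: 80
      initialDelaySeconds: 20
      periodSeconds: 10
    livenessProbe:
      tcpSocket:
        port: 80
      initialDelaySeconds: 20
      periodSeconds: 10
```

With this variant the probes would be expected to pass from 20s until nginx quits at about 60s, then fail three times in a row before the kubelet restarts the container.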
Original article: https://blog.51cto.com/goome/2422823