The previous article in this series, Kubernetes Tutorial (7): Mastering Pod Scheduling, covered the Pod scheduling mechanism and walked through several ways of placing a Pod onto a node: 1. fixed placement with nodeName, 2. targeted placement with nodeSelector, 3. node affinity scheduling. This article continues the series with the Pod health check mechanism.
Applications inevitably run into problems at runtime: program exceptions, software faults, hardware failures, network outages and so on. Kubernetes provides a Health Check mechanism for this: when an application is found to be unhealthy, the container can be restarted automatically, or the application can be removed from its Service, keeping the application highly available. Kubernetes defines three kinds of probes: livenessProbe, readinessProbe and startupProbe; this article focuses on the first two.
Every probe supports three health check methods: exec (run a command), httpGet and tcpSocket. exec is the most general and fits most scenarios, tcpSocket suits TCP services, and httpGet suits web services.
Every check method supports the same set of parameters for controlling when and how often checks run: initialDelaySeconds (how long to wait after the container starts before the first check), periodSeconds (the check interval), timeoutSeconds (the per-check timeout), successThreshold (the number of consecutive successes required to be considered healthy again) and failureThreshold (the number of consecutive failures required to be considered unhealthy).
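As a minimal sketch of how these pieces fit together (not taken from the demos below; the /healthz path and all values are placeholders), a livenessProbe combining an httpGet handler with the common parameters would look roughly like this:

livenessProbe:
  httpGet:                  # the handler; exec or tcpSocket could be used here instead
    path: /healthz          # hypothetical health endpoint
    port: 80
    scheme: HTTP
  initialDelaySeconds: 5    # wait 5s after the container starts before the first check
  periodSeconds: 10         # run a check every 10s
  timeoutSeconds: 3         # each check times out after 3s
  successThreshold: 1       # one success marks the container healthy again (must be 1 for liveness)
  failureThreshold: 3       # three consecutive failures mark it unhealthy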
Many applications cannot detect internal faults such as deadlocks on their own, yet restarting the process restores service; Kubernetes provides the liveness check for exactly this case. Taking exec as an example, we create a container that creates the file /tmp/liveness-probe.log at startup, deletes it 10 seconds later and then sleeps another 20 seconds. The liveness check runs ls -l /tmp/liveness-probe.log inside the container and judges health by the command's exit code: once the file is gone the exit code is non-zero, the check fails, and kubelet automatically restarts the container.
[root@node-1 demo]# cat centos-exec-liveness-probe.yaml
apiVersion: v1
kind: Pod
metadata:
  name: exec-liveness-probe
  annotations:
    kubernetes.io/description: "exec-liveness-probe"
spec:
  containers:
  - name: exec-liveness-probe
    image: centos:latest
    imagePullPolicy: IfNotPresent
    args: # container startup command; the container lives for roughly 30s
    - /bin/sh
    - -c
    - touch /tmp/liveness-probe.log && sleep 10 && rm -f /tmp/liveness-probe.log && sleep 20
    livenessProbe:
      exec: # health check: judge container health by the exit code of ls -l /tmp/liveness-probe.log
        command:
        - ls
        - -l
        - /tmp/liveness-probe.log
      initialDelaySeconds: 1
      periodSeconds: 5
      timeoutSeconds: 1
[root@node-1 demo]# kubectl apply -f centos-exec-liveness-probe.yaml
pod/exec-liveness-probe created
[root@node-1 demo]# kubectl describe pods exec-liveness-probe | tail
Events:
Type Reason Age From Message
---- ------ ---- ---- -------
Normal Scheduled 28s default-scheduler Successfully assigned default/exec-liveness-probe to node-3
Normal Pulled 27s kubelet, node-3 Container image "centos:latest" already present on machine
Normal Created 27s kubelet, node-3 Created container exec-liveness-probe
Normal Started 27s kubelet, node-3 Started container exec-liveness-probe
#the container has started
Warning Unhealthy 20s (x2 over 25s) kubelet, node-3 Liveness probe failed: /tmp/liveness-probe.log
ls: cannot access l: No such file or directory #the liveness check ran and reported a failure
Warning Unhealthy 15s kubelet, node-3 Liveness probe failed: ls: cannot access l: No such file or directory
ls: cannot access /tmp/liveness-probe.log: No such file or directory
Normal Killing 15s kubelet, node-3 Container exec-liveness-probe failed liveness probe, will be restarted
#the container is restarted
[root@node-1 demo]# kubectl get pods exec-liveness-probe
NAME READY STATUS RESTARTS AGE
exec-liveness-probe 1/1 Running 6 5m19s
The second check method is httpGet, which suits web services: here the liveness check requests the nginx default page /index.html on port 80 over HTTP.
[root@node-1 demo]# cat nginx-httpGet-liveness-readiness.yaml
apiVersion: v1
kind: Pod
metadata:
  name: nginx-httpget-livess-readiness-probe
  annotations:
    kubernetes.io/description: "nginx-httpGet-livess-readiness-probe"
spec:
  containers:
  - name: nginx-httpget-livess-readiness-probe
    image: nginx:latest
    ports:
    - name: http-80-port
      protocol: TCP
      containerPort: 80
    livenessProbe: # health check implemented with httpGet
      httpGet:
        port: 80
        scheme: HTTP
        path: /index.html
      initialDelaySeconds: 3
      periodSeconds: 10
      timeoutSeconds: 3
[root@node-1 demo]# kubectl apply -f nginx-httpGet-liveness-readiness.yaml
pod/nginx-httpget-livess-readiness-probe created
[root@node-1 demo]# kubectl get pods nginx-httpget-livess-readiness-probe
NAME READY STATUS RESTARTS AGE
nginx-httpget-livess-readiness-probe 1/1 Running 0 6s
Check which node the pod is running on:
[root@node-1 demo]# kubectl get pods nginx-httpget-livess-readiness-probe -o wide
NAME READY STATUS RESTARTS AGE IP NODE NOMINATED NODE READINESS GATES
nginx-httpget-livess-readiness-probe 1/1 Running 1 3m9s 10.244.2.19 node-3
Log in to the pod and delete index.html so that the liveness check fails:
[root@node-1 demo]# kubectl exec -it nginx-httpget-livess-readiness-probe /bin/bash
root@nginx-httpget-livess-readiness-probe:/# ls -l /usr/share/nginx/html/index.html
-rw-r--r-- 1 root root 612 Sep 24 14:49 /usr/share/nginx/html/index.html
root@nginx-httpget-livess-readiness-probe:/# rm -f /usr/share/nginx/html/index.html
[root@node-1 demo]# kubectl get pods nginx-httpget-livess-readiness-probe
NAME READY STATUS RESTARTS AGE
nginx-httpget-livess-readiness-probe 1/1 Running 1 4m22s
[root@node-1 demo]# kubectl describe pods nginx-httpget-livess-readiness-probe | tail
Events:
Type Reason Age From Message
---- ------ ---- ---- -------
Normal Scheduled 5m45s default-scheduler Successfully assigned default/nginx-httpget-livess-readiness-probe to node-3
Normal Pulling 3m29s (x2 over 5m45s) kubelet, node-3 Pulling image "nginx:latest"
Warning Unhealthy 3m29s (x3 over 3m49s) kubelet, node-3 Liveness probe failed: HTTP probe failed with statuscode: 404
Normal Killing 3m29s kubelet, node-3 Container nginx-httpget-livess-readiness-probe failed liveness probe, will be restarted
Normal Pulled 3m25s (x2 over 5m41s) kubelet, node-3 Successfully pulled image "nginx:latest"
Normal Created 3m25s (x2 over 5m40s) kubelet, node-3 Created container nginx-httpget-livess-readiness-probe
Normal Started 3m25s (x2 over 5m40s) kubelet, node-3 Started container nginx-httpget-livess-readiness-probe
The third check method is tcpSocket, which simply tries to open a TCP connection: here the liveness check probes TCP port 80 of the nginx container.
[root@node-1 demo]# cat nginx-tcp-liveness.yaml
apiVersion: v1
kind: Pod
metadata:
  name: nginx-tcp-liveness-probe
  annotations:
    kubernetes.io/description: "nginx-tcp-liveness-probe"
spec:
  containers:
  - name: nginx-tcp-liveness-probe
    image: nginx:latest
    ports:
    - name: http-80-port
      protocol: TCP
      containerPort: 80
    livenessProbe: # tcpSocket health check probing TCP port 80
      tcpSocket:
        port: 80
      initialDelaySeconds: 3
      periodSeconds: 10
      timeoutSeconds: 3
[root@node-1 demo]# kubectl apply -f nginx-tcp-liveness.yaml
pod/nginx-tcp-liveness-probe created
[root@node-1 demo]# kubectl get pods nginx-tcp-liveness-probe
NAME READY STATUS RESTARTS AGE
nginx-tcp-liveness-probe 1/1 Running 0 6s
Find the node the pod is running on:
[root@node-1 demo]# kubectl get pods nginx-tcp-liveness-probe -o wide
NAME READY STATUS RESTARTS AGE IP NODE NOMINATED NODE READINESS GATES
nginx-tcp-liveness-probe 1/1 Running 0 99s 10.244.2.20 node-3
Log in to the pod:
[root@node-1 demo]# kubectl exec -it nginx-tcp-liveness-probe /bin/bash
#run apt-get update and then apt-get install htop to install the htop tool
root@nginx-tcp-liveness-probe:/# apt-get update
Get:1 http://cdn-fastly.deb.debian.org/debian buster InRelease [122 kB]
Get:2 http://security-cdn.debian.org/debian-security buster/updates InRelease [39.1 kB]
Get:3 http://cdn-fastly.deb.debian.org/debian buster-updates InRelease [49.3 kB]
Get:4 http://security-cdn.debian.org/debian-security buster/updates/main amd64 Packages [95.7 kB]
Get:5 http://cdn-fastly.deb.debian.org/debian buster/main amd64 Packages [7899 kB]
Get:6 http://cdn-fastly.deb.debian.org/debian buster-updates/main amd64 Packages [5792 B]
Fetched 8210 kB in 3s (3094 kB/s)
Reading package lists... Done
root@nginx-tcp-liveness-probe:/# apt-get install htop
Reading package lists... Done
Building dependency tree
Reading state information... Done
Suggested packages:
lsof strace
The following NEW packages will be installed:
htop
0 upgraded, 1 newly installed, 0 to remove and 5 not upgraded.
Need to get 92.8 kB of archives.
After this operation, 230 kB of additional disk space will be used.
Get:1 http://cdn-fastly.deb.debian.org/debian buster/main amd64 htop amd64 2.2.0-1+b1 [92.8 kB]
Fetched 92.8 kB in 0s (221 kB/s)
debconf: delaying package configuration, since apt-utils is not installed
Selecting previously unselected package htop.
(Reading database ... 7203 files and directories currently installed.)
Preparing to unpack .../htop_2.2.0-1+b1_amd64.deb ...
Unpacking htop (2.2.0-1+b1) ...
Setting up htop (2.2.0-1+b1) ...
root@nginx-tcp-liveness-probe:/# kill 1
root@nginx-tcp-liveness-probe:/# command terminated with exit code 137
Check the pod:
[root@node-1 demo]# kubectl get pods nginx-tcp-liveness-probe
NAME READY STATUS RESTARTS AGE
nginx-tcp-liveness-probe 1/1 Running 1 13m
[root@node-1 demo]# kubectl describe pods nginx-tcp-liveness-probe | tail
Tolerations: node.kubernetes.io/not-ready:NoExecute for 300s
node.kubernetes.io/unreachable:NoExecute for 300s
Events:
Type Reason Age From Message
---- ------ ---- ---- -------
Normal Scheduled 14m default-scheduler Successfully assigned default/nginx-tcp-liveness-probe to node-3
Normal Pulling 44s (x2 over 14m) kubelet, node-3 Pulling image "nginx:latest"
Normal Pulled 40s (x2 over 14m) kubelet, node-3 Successfully pulled image "nginx:latest"
Normal Created 40s (x2 over 14m) kubelet, node-3 Created container nginx-tcp-liveness-probe
Normal Started 40s (x2 over 14m) kubelet, node-3 Started container nginx-tcp-liveness-probe
The readiness check is used when an application is exposed through a Service: it determines whether the application has finished starting up and is ready to receive traffic forwarded from outside. While the readiness check passes, the pod is listed in the Service's endpoints; when it fails, the pod is removed from the endpoints so that client traffic is not sent to it.
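As a side note, once the Service below has been created, a convenient way to watch this add/remove behaviour live is to keep a watch on its endpoints object while the readiness state changes; a minimal example, assuming the Service is named nginx-service as in the manifests that follow:

# ready pod IPs appear under ENDPOINTS and disappear again once the readiness probe starts failing
kubectl get endpoints nginx-service --watch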
[root@node-1 demo]# cat httpget-liveness-readiness-probe.yaml
apiVersion: v1
kind: Pod
metadata:
  name: nginx-tcp-liveness-probe
  annotations:
    kubernetes.io/description: "nginx-tcp-liveness-probe"
  labels: # labels are required so that the Service defined below can select this pod
    app: nginx
spec:
  containers:
  - name: nginx-tcp-liveness-probe
    image: nginx:latest
    ports:
    - name: http-80-port
      protocol: TCP
      containerPort: 80
    livenessProbe: # liveness probe
      httpGet:
        port: 80
        path: /index.html
        scheme: HTTP
      initialDelaySeconds: 3
      periodSeconds: 10
      timeoutSeconds: 3
    readinessProbe: # readiness probe
      httpGet:
        port: 80
        path: /test.html
        scheme: HTTP
      initialDelaySeconds: 3
      periodSeconds: 10
      timeoutSeconds: 3
Define a ClusterIP Service whose selector matches the pod's app: nginx label:
[root@node-1 demo]# cat nginx-service.yaml
apiVersion: v1
kind: Service
metadata:
  labels:
    app: nginx
  name: nginx-service
spec:
  ports:
  - name: http
    port: 80
    protocol: TCP
    targetPort: 80
  selector:
    app: nginx
  type: ClusterIP
[root@node-1 demo]# kubectl apply -f httpget-liveness-readiness-probe.yaml
pod/nginx-tcp-liveness-probe created
[root@node-1 demo]# kubectl apply -f nginx-service.yaml
service/nginx-service created
[root@node-1 ~]# kubectl get pods nginx-httpget-livess-readiness-probe
NAME READY STATUS RESTARTS AGE
nginx-httpget-livess-readiness-probe 1/1 Running 2 153m
#the readiness check is failing with HTTP 404 (see the last event line below)
[root@node-1 demo]# kubectl describe pods nginx-tcp-liveness-probe | tail
node.kubernetes.io/unreachable:NoExecute for 300s
Events:
Type Reason Age From Message
---- ------ ---- ---- -------
Normal Scheduled 2m6s default-scheduler Successfully assigned default/nginx-tcp-liveness-probe to node-3
Normal Pulling 2m5s kubelet, node-3 Pulling image "nginx:latest"
Normal Pulled 2m1s kubelet, node-3 Successfully pulled image "nginx:latest"
Normal Created 2m1s kubelet, node-3 Created container nginx-tcp-liveness-probe
Normal Started 2m1s kubelet, node-3 Started container nginx-tcp-liveness-probe
Warning Unhealthy 2s (x12 over 112s) kubelet, node-3 Readiness probe failed: HTTP probe failed with statuscode: 404
[root@node-1 ~]# kubectl describe services nginx-service
Name: nginx-service
Namespace: default
Labels: app=nginx
Annotations: kubectl.kubernetes.io/last-applied-configuration:
{"apiVersion":"v1","kind":"Service","metadata":{"annotations":{},"labels":{"app":"nginx"},"name":"nginx-service","namespace":"default"},"s...
Selector: app=nginx
Type: ClusterIP
IP: 10.110.54.40
Port: http 80/TCP
TargetPort: 80/TCP
Endpoints:               #the Endpoints list is empty
Session Affinity: None
Events:
#endpoints status
[root@node-1 demo]# kubectl describe endpoints nginx-service
Name: nginx-service
Namespace: default
Labels: app=nginx
Annotations: endpoints.kubernetes.io/last-change-trigger-time: 2019-09-30T14:27:37Z
Subsets:
Addresses:
NotReadyAddresses: 10.244.2.22 #the pod address is in NotReady state
Ports:
Name Port Protocol
---- ---- --------
http 80 TCP
Events:
[root@node-1 ~]# kubectl exec -it nginx-tcp-liveness-probe /bin/bash
root@nginx-tcp-liveness-probe:/# echo "readiness probe demo" >/usr/share/nginx/html/test.html
The readiness check now passes:
[root@node-1 demo]# curl http://10.244.2.22/test.html
Check the endpoints again:
[root@node-1 demo]# kubectl describe endpoints nginx-service
Name: nginx-service
Namespace: default
Labels: app=nginx
Annotations: endpoints.kubernetes.io/last-change-trigger-time: 2019-09-30T14:33:01Z
Subsets:
Addresses: 10.244.2.22 #the address is now ready: moved out of NotReadyAddresses into the normal Addresses list
NotReadyAddresses:
Ports:
Name Port Protocol
---- ---- --------
http 80 TCP
Check the Service status:
[root@node-1 demo]# kubectl describe services nginx-service
Name: nginx-service
Namespace: default
Labels: app=nginx
Annotations: kubectl.kubernetes.io/last-applied-configuration:
{"apiVersion":"v1","kind":"Service","metadata":{"annotations":{},"labels":{"app":"nginx"},"name":"nginx-service","namespace":"default"},"s...
Selector: app=nginx
Type: ClusterIP
IP: 10.110.54.40
Port: http 80/TCP
TargetPort: 80/TCP
Endpoints: 10.244.2.22:80 #the Service is now associated with the endpoint
Session Affinity: None
Events:
Delete the page again so that the readiness check fails:
[root@node-1 demo]# kubectl exec -it nginx-tcp-liveness-probe /bin/bash
root@nginx-tcp-liveness-probe:/# rm -f /usr/share/nginx/html/test.html
Check the pod's health-check events:
[root@node-1 demo]# kubectl get pods nginx-tcp-liveness-probe
NAME READY STATUS RESTARTS AGE
nginx-tcp-liveness-probe 0/1 Running 0 11m
[root@node-1 demo]# kubectl describe pods nginx-tcp-liveness-probe | tail
node.kubernetes.io/unreachable:NoExecute for 300s
Events:
Type Reason Age From Message
---- ------ ---- ---- -------
Normal Scheduled 12m default-scheduler Successfully assigned default/nginx-tcp-liveness-probe to node-3
Normal Pulling 12m kubelet, node-3 Pulling image "nginx:latest"
Normal Pulled 11m kubelet, node-3 Successfully pulled image "nginx:latest"
Normal Created 11m kubelet, node-3 Created container nginx-tcp-liveness-probe
Normal Started 11m kubelet, node-3 Started container nginx-tcp-liveness-probe
Warning Unhealthy 119s (x32 over 11m) kubelet, node-3 Readiness probe failed: HTTP probe failed with statuscode: 404
Check the endpoints:
[root@node-1 demo]# kubectl describe endpoints nginx-service
Name: nginx-service
Namespace: default
Labels: app=nginx
Annotations: endpoints.kubernetes.io/last-change-trigger-time: 2019-09-30T14:38:01Z
Subsets:
Addresses:
NotReadyAddresses: 10.244.2.22 #the readiness check is failing, so the address moves back to NotReady
Ports:
Name Port Protocol
---- ---- --------
http 80 TCP
Events:
Check the Service status; the endpoints list is empty again:
[root@node-1 demo]# kubectl describe services nginx-service
Name: nginx-service
Namespace: default
Labels: app=nginx
Annotations: kubectl.kubernetes.io/last-applied-configuration:
{"apiVersion":"v1","kind":"Service","metadata":{"annotations":{},"labels":{"app":"nginx"},"name":"nginx-service","namespace":"default"},"s...
Selector: app=nginx
Type: ClusterIP
IP: 10.110.54.40
Port: http 80/TCP
TargetPort: 80/TCP
Endpoints:               #empty
Session Affinity: None
Events:
In TKE (Tencent Kubernetes Engine) you can configure health checks for an application when creating a workload. The health check settings are part of each workload type and can be generated from the console template; they are found under the advanced options and are disabled by default. Both probes introduced above are available, the liveness probe (livenessProbe) and the readiness probe (readinessProbe), and each can be enabled separately as needed.
After enabling a probe you configure the check itself; the three methods described above are all supported: command check (exec), TCP port check (tcpSocket) and HTTP request check (httpGet).
For each method you simply fill in the corresponding parameters, such as the initial delay, the check interval and the response timeout; take the HTTP request check as an example.
Once the check is configured, the corresponding YAML is generated automatically when the workload is created. Taking the deployment just created as an example, the generated health-check YAML looks like this:
apiVersion: apps/v1beta2
kind: Deployment
metadata:
  annotations:
    deployment.kubernetes.io/revision: "1"
    description: tke-health-check-demo
  creationTimestamp: "2019-09-30T12:28:42Z"
  generation: 1
  labels:
    k8s-app: tke-health-check-demo
    qcloud-app: tke-health-check-demo
  name: tke-health-check-demo
  namespace: default
  resourceVersion: "2060365354"
  selfLink: /apis/apps/v1beta2/namespaces/default/deployments/tke-health-check-demo
  uid: d6cf1f25-e37d-11e9-87fd-567eb17a3218
spec:
  minReadySeconds: 10
  progressDeadlineSeconds: 600
  replicas: 1
  revisionHistoryLimit: 10
  selector:
    matchLabels:
      k8s-app: tke-health-check-demo
      qcloud-app: tke-health-check-demo
  strategy:
    rollingUpdate:
      maxSurge: 1
      maxUnavailable: 0
    type: RollingUpdate
  template:
    metadata:
      creationTimestamp: null
      labels:
        k8s-app: tke-health-check-demo
        qcloud-app: tke-health-check-demo
    spec:
      containers:
      - image: nginx:latest
        imagePullPolicy: Always
        livenessProbe: # health check generated from the console template
          failureThreshold: 1
          httpGet:
            path: /
            port: 80
            scheme: HTTP
          periodSeconds: 3
          successThreshold: 1
          timeoutSeconds: 2
        name: tke-health-check-demo
        resources:
          limits:
            cpu: 500m
            memory: 1Gi
          requests:
            cpu: 250m
            memory: 256Mi
        securityContext:
          privileged: false
          procMount: Default
        terminationMessagePath: /dev/termination-log
        terminationMessagePolicy: File
      dnsPolicy: ClusterFirst
      imagePullSecrets:
      - name: qcloudregistrykey
      - name: tencenthubkey
      restartPolicy: Always
      schedulerName: default-scheduler
      securityContext: {}
      terminationGracePeriodSeconds: 30
This article covered the two health-check probes in Kubernetes: livenessProbe and readinessProbe. livenessProbe performs liveness checks, i.e. whether the container is still running correctly; readinessProbe performs readiness checks, i.e. whether the application is ready to receive traffic, and is usually combined with a Service's endpoints: when a pod becomes ready it is added to the endpoints, and when it stops being ready it is removed, which is how a Service implements health checking and service probing for its backends. A probe can use one of three check methods, each suited to different scenarios: 1. exec, which runs a command or shell inside the container; 2. tcpSocket, which probes a port by establishing a TCP connection; 3. httpGet, which probes by issuing an HTTP request. Readers are encouraged to practise all three to master their usage.
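For quick reference, the three handler forms look roughly like the fragments below when placed under livenessProbe or readinessProbe; only one handler may appear in a given probe, and the port, path and script shown here are illustrative placeholders rather than values from the demos above.

exec:                       # healthy when the command exits with code 0
  command: ["/bin/sh", "-c", "/usr/local/bin/health-check.sh"]   # hypothetical script
tcpSocket:                  # healthy when a TCP connection to the port can be established
  port: 8080
httpGet:                    # healthy when the HTTP response code is between 200 and 399
  path: /healthz
  port: 8080
  scheme: HTTP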
Configure liveness, readiness and startup probes: https://kubernetes.io/docs/tasks/configure-pod-container/configure-liveness-readiness-startup-probes/
How to configure health checks in TKE: https://cloud.tencent.com/document/product/457/32815
When your talent cannot yet support your ambition, it is time to settle down and study.
Back to the Kubernetes tutorial series index
If you found this article helpful, please subscribe to the column and share it with friends who need it.