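Does initialDelaySeconds take effect on a k8s startupProbe?
As is well known, k8s currently has three kinds of probes. The official documentation describes them as follows: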
The kubelet uses liveness probes to know when to restart a container. For example, liveness probes could catch a deadlock, where an application is running, but unable to make progress. Restarting a container in such a state can help to make the application more available despite bugs.
The kubelet uses readiness probes to know when a container is ready to start accepting traffic. A Pod is considered ready when all of its containers are ready. One use of this signal is to control which Pods are used as backends for Services. When a Pod is not ready, it is removed from Service load balancers.
The kubelet uses startup probes to know when a container application has started. If such a probe is configured, it disables liveness and readiness checks until it succeeds, making sure those probes don't interfere with the application startup. This can be used to adopt liveness checks on slow starting containers, avoiding them getting killed by the kubelet before they are up and running.
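The English overview above is clear. Liveness probes decide when a container is restarted, which matches the literal meaning of "live": a container that fails its liveness check gets restarted. Readiness probes are the familiar concept from traditional load balancing: a container that passes the check joins the service pool. Anyone who knows nginx can compare this to the health checks configured for an upstream: healthy backends go in, unhealthy ones come out, and the container itself keeps running. Finally there is the newer startup probe. As I understand it, the use case is a slow-starting container: after it starts it needs to run for a while, and during that time you do not want the other two probes acting on it. Once the startup probe passes, the logic of the other two probes takes over. The benefit is an extra gate that protects slow-starting containers, so that checks made before the container is ready do not cause unpredictable problems.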
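The gating rule described above can be sketched in a few lines of Go. This is only an illustrative model of the documented behavior, not kubelet code; the probeState type and the activeProbes function are hypothetical names introduced here.

```go
package main

import "fmt"

// probeState is a hypothetical, simplified view of one container's probes.
type probeState struct {
	hasStartupProbe  bool // a startupProbe is configured
	startupSucceeded bool // the startupProbe has passed at least once
}

// activeProbes returns which probe types would be evaluated in the given
// state, following the documented rule that a configured startup probe holds
// back liveness and readiness checks until it succeeds.
func activeProbes(s probeState) []string {
	if s.hasStartupProbe && !s.startupSucceeded {
		return []string{"startup"}
	}
	return []string{"liveness", "readiness"}
}

func main() {
	// Startup probe configured but not yet passed: only it runs.
	fmt.Println(activeProbes(probeState{hasStartupProbe: true}))
	// Startup probe passed: the other two take over.
	fmt.Println(activeProbes(probeState{hasStartupProbe: true, startupSucceeded: true}))
	// No startup probe configured: liveness and readiness run from the start.
	fmt.Println(activeProbes(probeState{}))
}
```

The model deliberately ignores whether liveness and readiness probes are actually configured; it only shows who is allowed to run at each stage.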
With that background in place, the initialDelaySeconds parameter enters the picture. Again, here is the official documentation:
initialDelaySeconds: Number of seconds after the container has started before liveness or readiness probes are initiated. Defaults to 0 seconds. Minimum value is 0.
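Reading that text closely, you may wonder: does initialDelaySeconds actually apply to startup probes? I searched online and, unfortunately, found nobody who answers this definitively. Another official document says: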
initialDelaySeconds integer Number of seconds after the container has started before liveness probes are initiated. More info: https://kubernetes.io/docs/concepts/workloads/pods/pod-lifecycle#container-probes
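Really? Here even the mention of readiness probes has disappeared. So what now? Since nobody else had the answer, I had to dig in myself. My approach: read the source code for a first confirmation, then run an experiment to verify it.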
1. Search the k8s source code on GitHub. A probe is a Go struct:
type Probe struct {
	// The action taken to determine the health of a container
	ProbeHandler
	// Length of time before health checking is activated. In seconds.
	// +optional
	InitialDelaySeconds int32
	// Length of time before health checking times out. In seconds.
	// +optional
	TimeoutSeconds int32
	// How often (in seconds) to perform the probe.
	// +optional
	PeriodSeconds int32
	// Minimum consecutive successes for the probe to be considered successful after having failed.
	// Must be 1 for liveness and startup.
	// +optional
	SuccessThreshold int32
	// Minimum consecutive failures for the probe to be considered failed after having succeeded.
	// +optional
	FailureThreshold int32
	// Optional duration in seconds the pod needs to terminate gracefully upon probe failure.
	// The grace period is the duration in seconds after the processes running in the pod are sent
	// a termination signal and the time when the processes are forcibly halted with a kill signal.
	// Set this value longer than the expected cleanup time for your process.
	// If this value is nil, the pod's terminationGracePeriodSeconds will be used. Otherwise, this
	// value overrides the value provided by the pod spec.
	// Value must be non-negative integer. The value zero indicates stop immediately via
	// the kill signal (no opportunity to shut down).
	// This is a beta field and requires enabling ProbeTerminationGracePeriod feature gate.
	// +optional
	TerminationGracePeriodSeconds *int64
}
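Nothing above says that startup probes cannot use the InitialDelaySeconds parameter. The comment on that field speaks only of "health checking" in general and names no probe type.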
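To make the point concrete: in the API types a container's liveness, readiness and startup probes are all pointers to this same Probe struct, so the field is available on each of them. The Go sketch below is a trimmed-down illustration, not the kubelet's implementation. The delayElapsed helper is a hypothetical name, and the assumption that the delay is enforced identically for every probe kind is exactly what the experiment in step 2 sets out to check.

```go
package main

import "fmt"

// Probe keeps only the field under discussion; the real struct has more.
type Probe struct {
	InitialDelaySeconds int32
}

// Container mirrors how a container spec holds three probes of the same type.
type Container struct {
	LivenessProbe  *Probe
	ReadinessProbe *Probe
	StartupProbe   *Probe
}

// delayElapsed is a hypothetical helper: it reports whether a container that
// has been running for runningSeconds is past the probe's initial delay.
// It takes no probe-kind argument, so the same rule covers all three probes.
func delayElapsed(p *Probe, runningSeconds int32) bool {
	return runningSeconds >= p.InitialDelaySeconds
}

func main() {
	// Delays taken from the experiment's manifest: liveness 5s, startup 8s.
	c := Container{
		LivenessProbe: &Probe{InitialDelaySeconds: 5},
		StartupProbe:  &Probe{InitialDelaySeconds: 8},
	}
	for _, running := range []int32{4, 6, 8} {
		fmt.Printf("running %ds: liveness delay elapsed=%v, startup delay elapsed=%v\n",
			running,
			delayElapsed(c.LivenessProbe, running),
			delayElapsed(c.StartupProbe, running))
	}
}
```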
2. Run an experiment. Edison worked this way too, so there is nothing wrong with it.
apiVersion: v1
kind: Pod
metadata:
  labels:
    test: liveness
  name: liveness-exec
spec:
  containers:
  - name: liveness
    image: k8s.gcr.io/busybox
    args:
    - /bin/sh
    - -c
    - touch /tmp/healthy; sleep 30; rm -f /tmp/healthy; sleep 600
    resources:
      limits:
        cpu: 1000m
        memory: 2G
      requests:
        cpu: 1000m
        memory: 2G
    livenessProbe:
      exec:
        command:
        - cat
        - /tmp/healthy
      initialDelaySeconds: 5
      periodSeconds: 5

    startupProbe:
      exec:
        command:
        - cat
        - /tmp/helloworld # this file definitely does not exist
      failureThreshold: 3
      periodSeconds: 10
      initialDelaySeconds: 8 # this is the parameter under test
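Save the file above as exec-liveness.yaml and start the pod to test:
kubectl -n xxxx-test apply -f exec-liveness.yaml
Run the following command repeatedly in a terminal. At the very bottom of its output, a message shows that the startup probe failed:
kubectl -n xxxx-test describe pod liveness-exec
Look at the Age column: the container started 9s ago and the probe failure has only just been recorded (0s), so the first probe ran about 9s after start. That is very close to the configured 8s, right?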
Events:
  Type     Reason     Age        From                    Message
  ----     ------     ----       ----                    -------
  Normal   Scheduled  <unknown>  default-scheduler       Successfully assigned xxxx-test/liveness-exec to 10.132.135.31
  Normal   Pulled     9s         kubelet, 10.132.135.31  Container image "xxx/docker/busybox:1.24" already present on machine
  Normal   Created    9s         kubelet, 10.132.135.31  Created container liveness
  Normal   Started    9s         kubelet, 10.132.135.31  Started container liveness
  Warning  Unhealthy  0s         kubelet, 10.132.135.31  Startup probe failed: cat: can't open '/tmp/helloworld': No such file or directory
Change the setting to initialDelaySeconds: 15, delete the pod, apply the file again and check once more. The Age value there becomes 16s:
Events:
  Type     Reason     Age        From                    Message
  ----     ------     ----       ----                    -------
  Normal   Scheduled  <unknown>  default-scheduler       Successfully assigned xxxx-test/liveness-exec to 10.132.133.16
  Normal   Pulling    16s        kubelet, 10.132.133.16  Pulling image "xxx/docker/busybox:1.24"
  Normal   Pulled     16s        kubelet, 10.132.133.16  Successfully pulled image "xxxx/docker/busybox:1.24"
  Normal   Created    16s        kubelet, 10.132.133.16  Created container liveness
  Normal   Started    16s        kubelet, 10.132.133.16  Started container liveness
  Warning  Unhealthy  0s         kubelet, 10.132.133.16  Startup probe failed: cat: can't open '/tmp/helloworld': No such file or directory
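What does this show? In the current version, the InitialDelaySeconds parameter does take effect for startup probes, which matches the comments in the source code. Unfortunately, the official documentation is somewhat misleading on this point. Finally, I hope everyone uses these probes sensibly, catches problems early, and sleeps soundly.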
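As a rough aid for choosing values, the arithmetic behind the two runs can be written down in Go. This is a back-of-the-envelope sketch, not kubelet logic. The startupTiming type and its methods are hypothetical names, and the budget formula assumes the first probe fires within one period after the delay while ignoring probe timeouts and scheduling jitter.

```go
package main

import "fmt"

// startupTiming holds the three startupProbe settings used in the experiment.
type startupTiming struct {
	initialDelaySeconds int32
	periodSeconds       int32
	failureThreshold    int32
}

// earliestFirstProbe returns the earliest moment, in seconds after container
// start, at which the first startup probe can run.
func (t startupTiming) earliestFirstProbe() int32 {
	return t.initialDelaySeconds
}

// roughStartupBudget estimates an upper bound, in seconds after container
// start, before a startup probe that never passes has used up its failures.
// Assumption: delay + failureThreshold * period, timeouts and jitter ignored.
func (t startupTiming) roughStartupBudget() int32 {
	return t.initialDelaySeconds + t.failureThreshold*t.periodSeconds
}

func main() {
	runs := []startupTiming{
		{initialDelaySeconds: 8, periodSeconds: 10, failureThreshold: 3},
		{initialDelaySeconds: 15, periodSeconds: 10, failureThreshold: 3},
	}
	for _, t := range runs {
		fmt.Printf("delay=%ds: first probe no earlier than %ds, rough budget %ds\n",
			t.initialDelaySeconds, t.earliestFirstProbe(), t.roughStartupBudget())
	}
}
```

The first failures observed above, at about 9s and 16s, both fall just after the earliest possible moments of 8s and 15s, which is consistent with the delay being honored.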