The Node resource in Kubernetes is a cluster-level (non-namespaced) resource.

# List the Nodes in the cluster
root@k8s-master01:~# kubectl get nodes
NAME           STATUS   ROLES                  AGE   VERSION
k8s-master01   Ready    control-plane,master   38h   v1.21.1
k8s-node01     Ready    <none>                 38h   v1.21.1
k8s-node02     Ready    <none>                 38h   v1.21.1
k8s-node03     Ready    <none>                 38h   v1.21.1
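
Being cluster-scoped means a Node has no namespace, which can be confirmed with kubectl api-resources (the exact column layout may vary slightly by kubectl version):

# NAMESPACED is false for nodes
root@k8s-master01:~# kubectl api-resources | grep -w nodes
nodes             no     v1     false     Node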

View the details of a specific node:

root@k8s-master01:~# kubectl get nodes k8s-node01 -o wide
NAME         STATUS   ROLES    AGE   VERSION   INTERNAL-IP    EXTERNAL-IP   OS-IMAGE             KERNEL-VERSION     CONTAINER-RUNTIME
k8s-node01   Ready    <none>   38h   v1.21.1   172.16.11.81   <none>        Ubuntu 20.04.2 LTS   5.4.0-74-generic   docker://20.10.7
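
Individual fields from the wide output can also be selected with custom-columns; a small sketch pulling just the node's first reported address (the InternalIP here):

root@k8s-master01:~# kubectl get nodes k8s-node01 -o custom-columns=NAME:.metadata.name,INTERNAL-IP:.status.addresses[0].address
NAME         INTERNAL-IP
k8s-node01   172.16.11.81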

View a node's YAML manifest:

root@k8s-master01:~# kubectl get nodes k8s-node01 -o yaml | kubectl-neat
apiVersion: v1
kind: Node
metadata:
  annotations:
    csi.volume.kubernetes.io/nodeid: '{"driver.longhorn.io":"k8s-node01"}'
    flannel.alpha.coreos.com/backend-data: '{"VNI":1,"VtepMAC":"7e:c5:0b:9e:6a:5c"}'
    flannel.alpha.coreos.com/backend-type: vxlan
    flannel.alpha.coreos.com/kube-subnet-manager: "true"
    flannel.alpha.coreos.com/public-ip: 172.16.11.81
    kubeadm.alpha.kubernetes.io/cri-socket: /var/run/dockershim.sock
    node.alpha.kubernetes.io/ttl: "0"
    volumes.kubernetes.io/controller-managed-attach-detach: "true"
  labels:
    beta.kubernetes.io/arch: amd64
    beta.kubernetes.io/os: linux
    kubernetes.io/arch: amd64
    kubernetes.io/hostname: k8s-node01
    kubernetes.io/os: linux
  name: k8s-node01
spec:
  podCIDR: 10.244.1.0/24
  podCIDRs:
  - 10.244.1.0/24

Note that although a Node object can be created from a YAML manifest, doing so is of little practical value: nodes normally self-register through the kubelet, and a manually created Node only becomes usable once a kubelet with a matching name reports in.
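
For reference, a manually written Node manifest would look like the sketch below (k8s-node04 and its podCIDR are hypothetical); the apiserver will accept it, but the object stays NotReady until a kubelet with that name actually registers and reports status:

apiVersion: v1
kind: Node
metadata:
  name: k8s-node04            # hypothetical node name
  labels:
    kubernetes.io/hostname: k8s-node04
spec:
  podCIDR: 10.244.4.0/24      # illustrative pod subnet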

Under normal circumstances, whether a node is healthy is governed by the node controller, which is responsible for several management tasks throughout the node lifecycle.

Every resource type in Kubernetes must be governed by a corresponding controller; otherwise, the objects we create are merely data items stored in the apiserver and never actually take effect.

A single controller may manage more than one resource type, so controllers and resource types are not strictly one-to-one. The node controller is the controller dedicated to Node resources, and it handles several tasks across the node lifecycle: assigning a CIDR subnet to a node when it registers with the cluster, keeping its internal node list in sync with the list of available machines, and monitoring node health. Once a node is found to be unhealthy, the node controller prevents the scheduler from placing Pods on it. For each Node object, the node controller therefore performs health checks keyed on the name field in the object's metadata to verify whether the node is available.
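
The cadence of these health checks is controlled by kube-controller-manager flags; the values below are the upstream defaults around v1.21, shown only as a sketch:

# Node-controller related kube-controller-manager flags (defaults shown)
kube-controller-manager \
  --node-monitor-period=5s \         # how often the node controller syncs node status
  --node-monitor-grace-period=40s \  # how long a node may go unreported before being marked NotReady
  --pod-eviction-timeout=5m0s        # grace period before evicting pods from a failed node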

The kubelet is the agent program running on each node. It receives from the apiserver the management tasks destined for its node and executes them, and it reports its own running state back to the master to keep the cluster operating normally.
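
On the node itself the kubelet normally runs as a systemd service (as in the kubeadm-based setup used here), so its liveness can be checked directly; the output below is abbreviated:

root@k8s-node01:~# systemctl status kubelet --no-pager
● kubelet.service - kubelet: The Kubernetes Node Agent
     Active: active (running) ...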

Node leases

In versions before 1.13, the kubelet sent a heartbeat to the master every 10 seconds, uploading the node's full node status along with it. If the master received no heartbeat for 4 consecutive periods (40 seconds), it considered the node down and rescheduled all of its Pods onto other nodes. As the number of nodes grows, these frequent heartbeats and the large node-status payloads they carry put real pressure on the master, so newer versions of Kubernetes introduced the concept of node leases.

Node lease objects are stored in the kube-node-lease namespace.

# Show the node leases
root@k8s-master01:~# kubectl get lease -n kube-node-lease
NAME           HOLDER         AGE
k8s-master01   k8s-master01   39h
k8s-node01     k8s-node01     39h
k8s-node02     k8s-node02     39h
k8s-node03     k8s-node03     39h
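
An individual lease can be dumped as YAML; the fields below are the standard coordination.k8s.io/v1 Lease spec, with an illustrative renewTime:

root@k8s-master01:~# kubectl get lease k8s-node01 -n kube-node-lease -o yaml | kubectl-neat
apiVersion: coordination.k8s.io/v1
kind: Lease
metadata:
  name: k8s-node01
  namespace: kube-node-lease
spec:
  holderIdentity: k8s-node01                 # the kubelet that owns and renews this lease
  leaseDurationSeconds: 40                   # how long the lease remains valid after a renewal
  renewTime: "2021-06-30T03:10:46.000000Z"   # illustrative timestamp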

How node leases and the kubelet work together

Under normal conditions the kubelet still renews its lease every 10 seconds. It also recomputes the node status on the same 10-second cadence but does not upload it each time: the status is reported only when it actually changes, or when more than 5 minutes have passed since the last report.
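
These intervals correspond to KubeletConfiguration fields; the snippet below shows the upstream defaults (a sketch, not a required configuration):

apiVersion: kubelet.config.k8s.io/v1beta1
kind: KubeletConfiguration
nodeStatusUpdateFrequency: 10s   # how often node status is computed
nodeStatusReportFrequency: 5m    # forced report interval when the status has not changed
nodeLeaseDurationSeconds: 40     # validity of the node lease; renewed well within this window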

Node Status information

root@k8s-master01:~# kubectl get nodes k8s-node01 -o yaml
apiVersion: v1
kind: Node
metadata:
  annotations:
    csi.volume.kubernetes.io/nodeid: '{"driver.longhorn.io":"k8s-node01"}'
    flannel.alpha.coreos.com/backend-data: '{"VNI":1,"VtepMAC":"7e:c5:0b:9e:6a:5c"}'
    flannel.alpha.coreos.com/backend-type: vxlan
    flannel.alpha.coreos.com/kube-subnet-manager: "true"
    flannel.alpha.coreos.com/public-ip: 172.16.11.81
    kubeadm.alpha.kubernetes.io/cri-socket: /var/run/dockershim.sock
    node.alpha.kubernetes.io/ttl: "0"
    volumes.kubernetes.io/controller-managed-attach-detach: "true"
  creationTimestamp: "2021-06-28T11:08:03Z"
  labels:
    beta.kubernetes.io/arch: amd64
    beta.kubernetes.io/os: linux
    kubernetes.io/arch: amd64
    kubernetes.io/hostname: k8s-node01
    kubernetes.io/os: linux
  name: k8s-node01
  resourceVersion: "295806"
  uid: 0e86e892-f859-49dd-83fb-e030fd955026
spec:
  podCIDR: 10.244.1.0/24
  podCIDRs:
  - 10.244.1.0/24
status:
  addresses:
  - address: 172.16.11.81              # node address
    type: InternalIP                   # used for in-cluster communication
  - address: k8s-node01                # hostname
    type: Hostname
  allocatable:                         # resources available for allocation
    cpu: "6"                           # allocatable CPU
    ephemeral-storage: "188319986991"  # usable local disk space
    hugepages-1Gi: "0"                 # 1Gi huge pages, none available
    hugepages-2Mi: "0"                 # 2Mi huge pages, none available
    memory: 8049876Ki                  # total allocatable memory
    pods: "110"                        # max pods the node can run; this value is tunable
  capacity:                            # total resource capacity
    cpu: "6"
    ephemeral-storage: 204340264Ki
    hugepages-1Gi: "0"
    hugepages-2Mi: "0"
    memory: 8152276Ki
    pods: "110"
  conditions:                          # the node's conditions; each is evaluated as a boolean
  - lastHeartbeatTime: "2021-06-28T11:08:48Z"
    lastTransitionTime: "2021-06-28T11:08:48Z"
    message: Flannel is running on this node   # current message
    reason: FlannelIsUp
    status: "False"                    # "False" means the condition is not occurring
    type: NetworkUnavailable           # whether the network is unavailable
  - lastHeartbeatTime: "2021-06-30T02:47:23Z"
    lastTransitionTime: "2021-06-28T11:08:03Z"
    message: kubelet has sufficient memory available
    reason: KubeletHasSufficientMemory
    status: "False"                    # no memory pressure
    type: MemoryPressure               # whether memory is under pressure
  - lastHeartbeatTime: "2021-06-30T02:47:23Z"
    lastTransitionTime: "2021-06-28T11:08:03Z"
    message: kubelet has no disk pressure
    reason: KubeletHasNoDiskPressure
    status: "False"
    type: DiskPressure                 # whether the disk is under pressure
  - lastHeartbeatTime: "2021-06-30T02:47:23Z"
    lastTransitionTime: "2021-06-28T11:08:03Z"
    message: kubelet has sufficient PID available
    reason: KubeletHasSufficientPID
    status: "False"
    type: PIDPressure                  # whether PIDs are exhausted
  - lastHeartbeatTime: "2021-06-30T02:47:23Z"
    lastTransitionTime: "2021-06-28T11:08:53Z"
    message: kubelet is posting ready status. AppArmor enabled
    reason: KubeletReady               # whether the kubelet itself is ready
    status: "True"
    type: Ready
  daemonEndpoints:
    kubeletEndpoint:
      Port: 10250
  images:
  - names:
    - longhornio/longhorn-engine@sha256:e484ec73816c9620615d65ec61c75b07dadb3b3f6d3cb69a3d2d863df4f003d7
    - longhornio/longhorn-engine:v1.1.1
    sizeBytes: 315055800
  - names:
    - longhornio/longhorn-instance-manager@sha256:7d867436d6b5597a31c0dd958f642e970b9921fbe0d0fb6af9ba0a2233b33ae3
    - longhornio/longhorn-instance-manager:v1_20201216
    sizeBytes: 289445107
  - names:
    - longhornio/longhorn-manager@sha256:ede61fe2a472099e8fef562bab36c27add9430a6adf6402424f76e2865a5b0b7
    - longhornio/longhorn-manager:v1.1.1
    sizeBytes: 277611575
  - names:
    - registry.cn-hangzhou.aliyuncs.com/google_containers/kube-proxy@sha256:53af05c2a6cddd32cebf5856f71994f5d41ef2a62824b87f140f2087f91e4a38
    - registry.cn-hangzhou.aliyuncs.com/google_containers/kube-proxy:v1.21.1
    sizeBytes: 130788187
  - names:
    - ikubernetes/demoapp@sha256:6698b205eb18fb0171398927f3a35fe27676c6bf5757ef57a35a4b055badf2c3
    - ikubernetes/demoapp:v1.0
    sizeBytes: 92665704
  - names:
    - quay.io/coreos/flannel@sha256:4a330b2f2e74046e493b2edc30d61fdebbdddaaedcb32d62736f25be8d3c64d5
    - quay.io/coreos/flannel:v0.14.0
    sizeBytes: 67927607
  - names:
    - longhornio/csi-provisioner@sha256:d440c337b7ad7afdc812dc41ae3b56b61022357e0f20b0bc80d0b1916f1f4fc5
    - longhornio/csi-provisioner:v1.6.0-lh1
    sizeBytes: 48156165
  - names:
    - longhornio/csi-snapshotter@sha256:8104ea0f3dbe5dd2398978c31dcbf78bf59c74c1aa807943b3d200c9fcb486fb
    - longhornio/csi-snapshotter:v2.1.1-lh1
    sizeBytes: 46206695
  - names:
    - longhornio/csi-attacher@sha256:022425bc7a2b80d3c4bfc4d187624e86cf395374c0112834fcc0556860249979
    - longhornio/csi-attacher:v2.2.1-lh1
    sizeBytes: 44149287
  - names:
    - longhornio/csi-resizer@sha256:b62f052065a7b473790ad84fc553635ae733136b5c1dbe28cdb9b0320dcbf1b8
    - longhornio/csi-resizer:v0.5.1-lh1
    sizeBytes: 44083990
  - names:
    - longhornio/csi-node-driver-registrar@sha256:65c823d262048f728b7f5b933abbcf4e4a2faa0f1c58de1e7abb28b9d878924a
    - longhornio/csi-node-driver-registrar:v1.2.0-lh1
    sizeBytes: 16388083
  - names:
    - registry.cn-hangzhou.aliyuncs.com/google_containers/pause@sha256:6c3835cab3980f11b83277305d0d736051c32b17606f5ec59f1dda67c9ba3810
    - registry.cn-hangzhou.aliyuncs.com/google_containers/pause:3.4.1
    sizeBytes: 682696
  nodeInfo:
    architecture: amd64
    bootID: b1c34ce5-a7f3-4aae-8fbb-fe110a4eb1fb
    containerRuntimeVersion: docker://20.10.7
    kernelVersion: 5.4.0-74-generic
    kubeProxyVersion: v1.21.1
    kubeletVersion: v1.21.1
    machineID: 4d678ef4cd0d4081a377e1b7cd7ec000
    operatingSystem: linux
    osImage: Ubuntu 20.04.2 LTS
    systemUUID: d5beffc8-337a-4fbc-934e-e276b2c42f09

In the status section, pay particular attention to everything under allocatable, capacity, and conditions.
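
For scripting, individual status fields can be extracted with jsonpath; a sketch pulling the Ready condition and the allocatable block:

# Status of the Ready condition
root@k8s-master01:~# kubectl get node k8s-node01 -o jsonpath='{.status.conditions[?(@.type=="Ready")].status}'
True
# The whole allocatable map
root@k8s-master01:~# kubectl get node k8s-node01 -o jsonpath='{.status.allocatable}'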

Node describe information

Besides the YAML output, a node's status can also be examined with kubectl describe.

root@k8s-master01:~# kubectl describe nodes k8s-node01
Name:               k8s-node01
Roles:              <none>
Labels:             beta.kubernetes.io/arch=amd64
                    beta.kubernetes.io/os=linux
                    kubernetes.io/arch=amd64
                    kubernetes.io/hostname=k8s-node01
                    kubernetes.io/os=linux
Annotations:        csi.volume.kubernetes.io/nodeid: {"driver.longhorn.io":"k8s-node01"}
                    flannel.alpha.coreos.com/backend-data: {"VNI":1,"VtepMAC":"7e:c5:0b:9e:6a:5c"}
                    flannel.alpha.coreos.com/backend-type: vxlan
                    flannel.alpha.coreos.com/kube-subnet-manager: true
                    flannel.alpha.coreos.com/public-ip: 172.16.11.81
                    kubeadm.alpha.kubernetes.io/cri-socket: /var/run/dockershim.sock
                    node.alpha.kubernetes.io/ttl: 0
                    volumes.kubernetes.io/controller-managed-attach-detach: true
CreationTimestamp:  Mon, 28 Jun 2021 11:08:03 +0000
Taints:             <none>
Unschedulable:      false
Lease:
  HolderIdentity:  k8s-node01
  AcquireTime:     <unset>
  RenewTime:       Wed, 30 Jun 2021 03:10:46 +0000
Conditions:
  Type                 Status  LastHeartbeatTime                 LastTransitionTime                Reason                       Message
  ----                 ------  -----------------                 ------------------                ------                       -------
  NetworkUnavailable   False   Mon, 28 Jun 2021 11:08:48 +0000   Mon, 28 Jun 2021 11:08:48 +0000   FlannelIsUp                  Flannel is running on this node
  MemoryPressure       False   Wed, 30 Jun 2021 03:07:34 +0000   Mon, 28 Jun 2021 11:08:03 +0000   KubeletHasSufficientMemory   kubelet has sufficient memory available
  DiskPressure         False   Wed, 30 Jun 2021 03:07:34 +0000   Mon, 28 Jun 2021 11:08:03 +0000   KubeletHasNoDiskPressure     kubelet has no disk pressure
  PIDPressure          False   Wed, 30 Jun 2021 03:07:34 +0000   Mon, 28 Jun 2021 11:08:03 +0000   KubeletHasSufficientPID      kubelet has sufficient PID available
  Ready                True    Wed, 30 Jun 2021 03:07:34 +0000   Mon, 28 Jun 2021 11:08:53 +0000   KubeletReady                 kubelet is posting ready status. AppArmor enabled
Addresses:
  InternalIP:  172.16.11.81
  Hostname:    k8s-node01
Capacity:
  cpu:                6
  ephemeral-storage:  204340264Ki
  hugepages-1Gi:      0
  hugepages-2Mi:      0
  memory:             8152276Ki
  pods:               110
Allocatable:
  cpu:                6
  ephemeral-storage:  188319986991
  hugepages-1Gi:      0
  hugepages-2Mi:      0
  memory:             8049876Ki
  pods:               110
System Info:
  Machine ID:                 4d678ef4cd0d4081a377e1b7cd7ec000
  System UUID:                d5beffc8-337a-4fbc-934e-e276b2c42f09
  Boot ID:                    b1c34ce5-a7f3-4aae-8fbb-fe110a4eb1fb
  Kernel Version:             5.4.0-74-generic
  OS Image:                   Ubuntu 20.04.2 LTS
  Operating System:           linux
  Architecture:               amd64
  Container Runtime Version:  docker://20.10.7
  Kubelet Version:            v1.21.1
  Kube-Proxy Version:         v1.21.1
PodCIDR:                      10.244.1.0/24
PodCIDRs:                     10.244.1.0/24
Non-terminated Pods:          (13 in total)    # pods not in a terminal state
  Namespace        Name                              CPU Requests  CPU Limits  Memory Requests  Memory Limits  Age
  ---------        ----                              ------------  ----------  ---------------  -------------  ---
  default          demoapp-5f7d8f9847-r7h7b          0 (0%)        0 (0%)      0 (0%)           0 (0%)         37h
  kube-system      kube-flannel-ds-7rzqr             100m (1%)     100m (1%)   50Mi (0%)        50Mi (0%)      40h
  kube-system      kube-proxy-5splm                  0 (0%)        0 (0%)      0 (0%)           0 (0%)         40h
  longhorn-system  csi-attacher-5dcdcd5984-6frz7     0 (0%)        0 (0%)      0 (0%)           0 (0%)         24h
  longhorn-system  csi-attacher-5dcdcd5984-lqvk5     0 (0%)        0 (0%)      0 (0%)           0 (0%)         24h
  longhorn-system  csi-provisioner-5c9dfb6446-nxp28  0 (0%)        0 (0%)      0 (0%)           0 (0%)         24h
  longhorn-system  csi-resizer-6696d857b6-ptjw6      0 (0%)        0 (0%)      0 (0%)           0 (0%)         24h
  longhorn-system  csi-snapshotter-96bfff7c9-z5nrc   0 (0%)        0 (0%)      0 (0%)           0 (0%)         24h
  longhorn-system  engine-image-ei-611d1496-chh2q    0 (0%)        0 (0%)      0 (0%)           0 (0%)         24h
  longhorn-system  instance-manager-e-ef4e3ece       720m (12%)    0 (0%)      0 (0%)           0 (0%)         24h
  longhorn-system  instance-manager-r-0b9ddd42       720m (12%)    0 (0%)      0 (0%)           0 (0%)         24h
  longhorn-system  longhorn-csi-plugin-rdtn8         0 (0%)        0 (0%)      0 (0%)           0 (0%)         24h
  longhorn-system  longhorn-manager-7h57m            0 (0%)        0 (0%)      0 (0%)           0 (0%)         24h
Allocated resources:          # resources already allocated on this node
  (Total limits may be over 100 percent, i.e., overcommitted.)
  Resource           Requests     Limits
  --------           --------     ------
  cpu                1540m (25%)  100m (1%)
  memory             50Mi (0%)    50Mi (0%)
  ephemeral-storage  0 (0%)       0 (0%)
  hugepages-1Gi      0 (0%)       0 (0%)
  hugepages-2Mi      0 (0%)       0 (0%)
Events:                       <none>
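
The pod list shown under Non-terminated Pods can also be produced directly with a field selector on spec.nodeName:

# All pods scheduled to k8s-node01, across namespaces
root@k8s-master01:~# kubectl get pods -A --field-selector spec.nodeName=k8s-node01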