為什麼 EKS worker node 可以自動加入 EKS cluster(二)?
接續前一篇,我們了解了 EKS node group 定義,並知道了 EKS worker node 使用了 EKS optimized Amazon Linux AMI 內預先設置好的 bootstrap 設定 container runtime、kubelet 等設定。
Kubernetes 為確保 Control Plane 及 worker node 溝通安全性,於 Kubernetes 1.4 導入了 certificate request 及 signing API 機制 [1]。當然 EKS worker node 也皆是基於原生 Kubernetes 架構實現,EKS worker node 使用 kubelet TLS bootstrapping 流程與 API server 進行溝通。
故本文將繼續 EKS worker node 如何讓 kubelet 自動化完成 TLS bootstrap 流程,自動加入 EKS cluster。
kubelet bootstrap 初始化步驟
根據 Kubernetes TLS bootstrapping [2] 文件,kubelet 在 bootstrap 初始化步驟如下:
- kubelet 啟動。
- kubelet 查看是否有相應 kubeconfig 設定檔。
- kueblet 檢視是否有對應 bootstrap-kubeconfig 設定檔。
- 基於 bootstrap 設定檔,取得 API server endpoint 及有限制權限 token。
- 透過上述 token 認證(authenticates)與 API server 建立連線。
- kubelet 使用建立有限制權限 credentials 建立並取得 Certificate Signing Requests(CSR) [3]。
- kubelet 建立 CSR,並將 singerName 設定為 kubernetes.io/kube-apiserver-client-kubelet。
- CSR 被核准。可以透過以下兩種方式:
kube-controller-manager
自動核准 CSR- 透過外部流程或是人為核准,透過 kubectl 或是 Kubernetes API 方式核准 CSR
- kubelet 所需要的憑證(certificate)被建立。
- 憑證(certificate)被發送給 kubelet。
- kubelet 取得該憑證(certificate)。
- kubelet 建立 kubeconfig,其中包含金鑰和已簽名的憑證。
- kubelet 開始正常執行
- (選擇性設定)kubelet 在憑證接近於過期時間將會自動請求更新憑證
- 更新憑證會被核准並發證。
驗證 EKS 設定
在設定上,為達到自動化核准發證流程,我們必須設定以下元件,並需要 Kubernetes Certificate Authority (CA):
- kube-apiserver
- kube-controller-manager
- kubelet
kubelet
首先,透過 systemctl
命令查看 kubelet
systemd unit 設定檔:
[ec2-user@ip-192-168-65-212 ~]$ systemctl cat kubelet
# /etc/systemd/system/kubelet.service
[Unit]
Description=Kubernetes Kubelet
Documentation=https://github.com/kubernetes/kubernetes
After=docker.service iptables-restore.service
Requires=docker.service
[Service]
ExecStartPre=/sbin/iptables -P FORWARD ACCEPT -w 5
ExecStart=/usr/bin/kubelet --cloud-provider aws \
--config /etc/kubernetes/kubelet/kubelet-config.json \
--kubeconfig /var/lib/kubelet/kubeconfig \
--container-runtime docker \
--network-plugin cni $KUBELET_ARGS $KUBELET_EXTRA_ARGS
Restart=always
RestartSec=5
KillMode=process
[Install]
WantedBy=multi-user.target
# /etc/systemd/system/kubelet.service.d/10-kubelet-args.conf
[Service]
Environment='KUBELET_ARGS=--node-ip=192.168.65.212 --pod-infra-container-image=602401143452.dkr.ecr.eu-west-1.amazonaws.com/eks/pause:3.5 --v=2'
# /etc/systemd/system/kubelet.service.d/30-kubelet-extra-args.conf
[Service]
Environment='KUBELET_EXTRA_ARGS=--node-labels=eks.amazonaws.com/sourceLaunchTemplateVersion=1,alpha.eksctl.io/nodegroup-name=ng1-public-ssh,alpha.eksctl.io/cluster-name=ironman-2022,eks.amazonaws.com/nodegroup-image=ami-0ec9e1727a24fb788,eks.a
可以觀察到設定了:
- kubeconfig:
/var/lib/kubelet/kubeconfig
使用了 kubelet username 並透過aws-iam-authenticator
取得 cluster token。
[ec2-user@ip-192-168-65-212 ~]$ cat /var/lib/kubelet/kubeconfig
apiVersion: v1
kind: Config
clusters:
- cluster:
certificate-authority: /etc/kubernetes/pki/ca.crt
server: https://A8E7A39CAEBEF6AA9250DFA9366FDFA2.gr7.eu-west-1.eks.amazonaws.com
name: kubernetes
contexts:
- context:
cluster: kubernetes
user: kubelet
name: kubelet
current-context: kubelet
users:
- name: kubelet
user:
exec:
apiVersion: client.authentication.k8s.io/v1alpha1
command: /usr/bin/aws-iam-authenticator
args:
- "token"
- "-i"
- "ironman-2022"
- --region
- "eu-west-1"
此 kubeconfig 也再次驗證了 EKS optimized Amazon Linux AMI 預設安裝了 aws-iam-authenticator
[4] 以提供 EKS worker node 上 kubelet 進行驗證(authenticate)流程。
- kubelet config:
/etc/kubernetes/kubelet/kubelet-config.json
[ec2-user@ip-192-168-90-19 ~]$ cat /etc/kubernetes/kubelet/kubelet-config.json
{
"kind": "KubeletConfiguration",
"apiVersion": "kubelet.config.k8s.io/v1beta1",
"address": "0.0.0.0",
"authentication": {
"anonymous": {
"enabled": false
},
"webhook": {
"cacheTTL": "2m0s",
"enabled": true
},
"x509": {
"clientCAFile": "/etc/kubernetes/pki/ca.crt"
}
},
"authorization": {
"mode": "Webhook",
"webhook": {
"cacheAuthorizedTTL": "5m0s",
"cacheUnauthorizedTTL": "30s"
}
},
"clusterDomain": "cluster.local",
"hairpinMode": "hairpin-veth",
"readOnlyPort": 0,
"cgroupDriver": "cgroupfs",
"cgroupRoot": "/",
"featureGates": {
"RotateKubeletServerCertificate": true
},
"protectKernelDefaults": true,
"serializeImagePulls": false,
"serverTLSBootstrap": true,
"tlsCipherSuites": [
"TLS_ECDHE_ECDSA_WITH_AES_128_GCM_SHA256",
"TLS_ECDHE_RSA_WITH_AES_128_GCM_SHA256",
"TLS_ECDHE_ECDSA_WITH_CHACHA20_POLY1305",
"TLS_ECDHE_RSA_WITH_AES_256_GCM_SHA384",
"TLS_ECDHE_RSA_WITH_CHACHA20_POLY1305",
"TLS_ECDHE_ECDSA_WITH_AES_256_GCM_SHA384",
"TLS_RSA_WITH_AES_256_GCM_SHA384",
"TLS_RSA_WITH_AES_128_GCM_SHA256"
],
"clusterDNS": [
"10.100.0.10"
],
"evictionHard": {
"memory.available": "100Mi",
"nodefs.available": "10%",
"nodefs.inodesFree": "5%"
},
"kubeReserved": {
"cpu": "70m",
"ephemeral-storage": "1Gi",
"memory": "574Mi"
}
}
在這些 kubelet flag [5] 其中我們關注與 TLS Bootstrap 有關 flag:
serverTLSBootstrap: true
:此將允許來自certificates.k8s.io
API 給予 kubelet 使用 kubelet serving certificates。然而此為一個 已知安全性限制 [6],此類型 certificate 無法透過kube-controller-manager - kubernetes.io/kubelet-serving
自動化核准 CSR,則需要仰賴使用者或是第三方 controller。featureGates.RotateKubeletServerCertificate: ture
:kubelet 在 bootstrapping 自身 client credentials 後請求 serving certificate 並自動替換憑證。
kube-apiserver
The kube-apiserver 需要以下三點條件才能啟用 TLS bootstrapping:
- 設別 client certificate 的 CA
- 驗證(authenticate) bootstrapping kubelet 並設定於
system: bootstrappers
group - 授權(authorize) bootstrapping kubelet 建立 CSR
其中驗證流程,由上述提及透過 EKS optimized Amazon Linux AMI 所定義 bootstrap.sh script 定義 kubeconfig 使用 aws-iam-authenticator
。而 kubelet 在經由 API server 認證後,由 RBAC system: node
role 和 Node authorizer [7] 將允許節點建立和讀取 CSRs。因此我們可以在 EKS 預設 eks: node-bootstrapper
role 上檢視:
- 給予 kubelet 權限提交 CSR
$ kubectl get clusterrole eks:node-bootstrapper -o yaml
apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRole
metadata:
annotations:
kubectl.kubernetes.io/last-applied-configuration: |
{"apiVersion":"rbac.authorization.k8s.io/v1","kind":"ClusterRole","metadata":{"annotations":{},"labels":{"eks.amazonaws.com/component":"node"},"name":"eks:node-bootstrapper"},"rules":[{"apiGroups":["certificates.k8s.io"],"resources":["certificatesigningrequests/selfnodeserver"],"verbs":["create"]}]}
creationTimestamp: "2022-09-14T09:46:17Z"
labels:
eks.amazonaws.com/component: node
name: eks:node-bootstrapper
resourceVersion: "283"
uid: eb23d8fe-dfdf-4f01-aba7-72ca32b52ad7
rules:
- apiGroups:
- certificates.k8s.io
resources:
- certificatesigningrequests/selfnodeserver
verbs:
- create
- cluster role
eks: node-bootstrapper
綁定system: bootstrappers
及system: nodes
group
$ kubectl get clusterrolebindings eks:node-bootstrapper -o yaml
apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRoleBinding
metadata:
annotations:
kubectl.kubernetes.io/last-applied-configuration: |
{"apiVersion":"rbac.authorization.k8s.io/v1","kind":"ClusterRoleBinding","metadata":{"annotations":{},"labels":{"eks.amazonaws.com/component":"node"},"name":"eks:node-bootstrapper"},"roleRef":{"apiGroup":"rbac.authorization.k8s.io","kind":"ClusterRole","name":"eks:node-bootstrapper"},"subjects":[{"apiGroup":"rbac.authorization.k8s.io","kind":"Group","name":"system:bootstrappers"},{"apiGroup":"rbac.authorization.k8s.io","kind":"Group","name":"system:nodes"}]}
creationTimestamp: "2022-09-14T09:46:16Z"
labels:
eks.amazonaws.com/component: node
name: eks:node-bootstrapper
resourceVersion: "282"
uid: 867196bc-b84a-410d-8d4b-cbf52d840108
roleRef:
apiGroup: rbac.authorization.k8s.io
kind: ClusterRole
name: eks:node-bootstrapper
subjects:
- apiGroup: rbac.authorization.k8s.io
kind: Group
name: system:bootstrappers
- apiGroup: rbac.authorization.k8s.io
kind: Group
name: system:nodes
- 透過 CloudWatch Logs insight syntax 檢視
kube-apiserver
logs 一樣可以查看到--authorization-mode
使用了 Node 及 RBAC 兩種授權模式。
filter @logStream not like /^kube-apiserver-audit/
| filter @logStream like /^kube-apiserver-/
| fields @timestamp, @message
| sort @timestamp asc
| filter @message like "--authorization-mode"
| limit 10000
I0914 09:46:06.295709 10 flags.go:59] FLAG: --authorization-mode="[Node,RBAC]"
kube-controller-manager
EKS 使用了 kubelet serving certificates 方式,此方式僅能透過第三方 controller 方式核准 CSR。此部分,我們也可以再次使用 CloudWatch Logs insight syntax 語法檢視此 flag 可以觀察到停用原生 Kubernetes csrsigning
controller。
filter @logStream not like /^kube-controller-manager/
| fields @timestamp, @message
| sort @timestamp asc
| filter @message like "--controller"
| limit 10000
I0914 09:51:40.684692 11 flags.go:59] FLAG: --controllers="[*,-csrsigning]"
實際啟用 EKS 節點
我們透過 eksctl scale
命令 scale up 一個 node 來查看檢視相應元件設定,及 TLS bootstrap 流程:
$ eksctl scale ng --nodes=3 --name=ng1-public-ssh --cluster=ironman-2022
2022-09-19 09:44:52 [ℹ] scaling nodegroup "ng1-public-ssh" in cluster ironman-2022
2022-09-19 09:44:53 [ℹ] waiting for scaling of nodegroup "ng1-public-ssh" to complete
2022-09-19 09:46:31 [ℹ] nodegroup successfully scaled
透過 kubectl get CSR 可以先觀察到 由 node system: node: ip-192-168-65-212.eu-west-1.compute.internal
作為 requestor 發起 CSR。約 11 秒後,此 CSR 被 Approved。接續,CSR 被 Issued 給 node。
$ kubectl get node
NAME STATUS ROLES AGE VERSION
...
ip-192-168-65-212.eu-west-1.compute.internal NotReady <none> 0s v1.22.12-eks-ba74326
$ kubectl get csr
NAME AGE SIGNERNAME REQUESTOR REQUESTEDDURATION CONDITION
csr-kdkll 11s kubernetes.io/kubelet-serving system:node:ip-192-168-65-212.eu-west-1.compute.internal <none> Approved
$ kubectl get csr
NAME AGE SIGNERNAME REQUESTOR REQUESTEDDURATION CONDITION
csr-kdkll 16s kubernetes.io/kubelet-serving system:node:ip-192-168-65-212.eu-west-1.compute.internal <none> Approved,Issued
$ kubectl get node
NAME STATUS ROLES AGE VERSION
node/ip-192-168-18-254.eu-west-1.compute.internal Ready <none> 4d17h v1.22.12-eks-ba74326
node/ip-192-168-40-16.eu-west-1.compute.internal Ready <none> 19h v1.22.12-eks-ba74326
node/ip-192-168-65-212.eu-west-1.compute.internal Ready <none> 43s v1.22.12-eks-ba74326
$ kubectl describe csr csr-kdkll
Name: csr-kdkll
Labels: <none>
Annotations: <none>
CreationTimestamp: Mon, 19 Sep 2022 09:46:59 +0000
Requesting User: system:node:ip-192-168-65-212.eu-west-1.compute.internal
Signer: kubernetes.io/kubelet-serving
Status: Approved,Issued
Subject:
Common Name: system:node:ip-192-168-65-212.eu-west-1.compute.internal
Serial Number:
Organization: system:nodes
Subject Alternative Names:
DNS Names: ec2-52-211-162-59.eu-west-1.compute.amazonaws.com
ip-192-168-65-212.eu-west-1.compute.internal
IP Addresses: 192.168.65.212
52.211.162.59
Events: <none>
由於並未能查看到 EKS 是否有使用第三方 controller 進 行核准(approve)CSR。我們透過 CloudWatch Logs insight syntax 語法檢視 kube-apiserver-audit
日誌內記錄 CSR csr-kdkll
流程變化。
filter @logStream like /^kube-apiserver-audit/
| fields @timestamp, @message
| sort @timestamp asc
| filter @message like 'csr-kdkll'
| limit 10000
- 2022-09-19 09:46:59 UTC+0:
kubelet/v1.22.12 (linux/amd64) kubernetes/1fc8914
使用了建立了 CSRcsr-kdkll
user.username
:`system: node: ip-192-168-65-212.eu-west-1.compute.internaluser.uid
:aws-iam-authenticator:111111111111: AROAYFMQSNSE3QYOZUIO6
responseObject.spec.signerName
:kubernetes.io/kubelet-serving
responseObject.spec.usages
digital signature
key encipherment
server auth
- 2022-09-19 09:46:59 UTC+0:
eks: certificate-controller
更新 CSR 並核准(approve)CSRuser.username
:eks: certificate-controller
responseObject.status.conditions.0.message
:Auto approving self kubelet server certificate after SubjectAccessReview.
responseObject.status.conditions.0.reason
:AutoApproved
由上述可以了解此 CSR 符合 Kubernetes signers signerName
kubernetes.io/kubelet-serving
而不被 kube-controller-manager
自動核准,此 CSR 則是由 EKS eks: certificate-controller
來自動核准。
而經由 CSR 核准並發證於 node 上。我們也可以於 /var/lib/kubelet/pki/
目錄查看到 kubelet-server-current.pem
憑證。
[ec2-user@ip-192-168-65-212 ~]$ journalctl -u kubelet | grep "certificate signing request"
Sep 19 09:47:00 ip-192-168-65-212.eu-west-1.compute.internal kubelet[3347]: I0919 09:47:00.509710 3347 csr.go:262] certificate signing request csr-kdkll is approved, waiting to be issued
Sep 19 09:47:14 ip-192-168-65-212.eu-west-1.compute.internal kubelet[3347]: I0919 09:47:14.707915 3347 csr.go:258] certificate signing request csr-kdkll is issued
[ec2-user@ip-192-168-65-212 ~]$ sudo ls -al /var/lib/kubelet/pki/
total 4
drwxr-xr-x 2 root root 86 Sep 14 16:39 .
drwxr-xr-x 8 root root 182 Sep 14 16:39 ..
-rw------- 1 root root 1370 Sep 14 16:39 kubelet-server-2022-09-14-16-39-07.pem
lrwxrwxrwx 1 root root 59 Sep 14 16:39 kubelet-server-current.pem -> /var/lib/kubelet/pki/kubelet-server-2022-09-14-16-39-07.pem
總結
比對原生 Kubernetes 文件及 EKS 上 kube-apiserver、 kube-controller-manager 及 kubelet 元件設定後,我們可以確定 EKS 於 Control Plane 端設置了 eks: certificate-controller
提供自動核准(Auto-Aprroved)CSR 並 issue CSR 給予 kubelet 使用。在取得 CSR 後,node 則可以順利加入 EKS cluster。
參考文件
- Add proposal for kubelet TLS bootstrap - https://github.com/kubernetes/kubernetes/pull/20439/files
- TLS bootstrapping - https://kubernetes.io/docs/reference/access-authn-authz/kubelet-tls-bootstrapping/
- Certificate Signing Requests - https://kubernetes.io/docs/reference/access-authn-authz/certificate-signing-requests/
- AWS IAM Authenticator for Kubernetes - https://github.com/kubernetes-sigs/aws-iam-authenticator
- kubelet - https://kubernetes.io/docs/reference/command-line-tools-reference/kubelet/
- design: reduce scope of node on node object w.r.t ip #1982 - https://github.com/kubernetes/community/pull/1982
- Using Node Authorization - https://kubernetes.io/docs/reference/access-authn-authz/node/
- https://kubernetes.io/docs/reference/access-authn-authz/kubelet-tls-bootstrapping/#certificate-rotation