[13] 為什麼透過 CloudFormation template 或 ekstcl
啟用的 self-managed node 可以自動加入 EKS cluster
在先前 Day3 Day4 時,我們討論到 Amazon EKS AMI Build Specification script 定義了自動化設置 kubelet 設定檔,使 node 可以順利地加入節點。
故本文章將探討 CloudFormation template 及 eksctl
命令是如何根據此 bootstrap.sh 啟用 worker node。
CloudFormation template
若我們使用 CloudFormation 方式建立 self-managed nodes,我們可以使用以下 curl
command 下載官方提供的 CloudFormation template。
curl -o amazon-eks-nodegroup.yaml https://s3.us-west-2.amazonaws.com/amazon-eks/cloudformation/2020-10-29/amazon-eks-nodegroup.yaml
以下列舉部分 CloudFormation template 資訊:
NodeLaunchTemplate:
Type: "AWS::EC2::LaunchTemplate"
Properties:
LaunchTemplateData:
BlockDeviceMappings:
- DeviceName: /dev/xvda
Ebs:
DeleteOnTermination: true
VolumeSize: !Ref NodeVolumeSize
VolumeType: gp2
IamInstanceProfile:
Arn: !GetAtt NodeInstanceProfile.Arn
ImageId: !If
- HasNodeImageId
- !Ref NodeImageId
- !Ref NodeImageIdSSMParam
InstanceType: !Ref NodeInstanceType
KeyName: !Ref KeyName
SecurityGroupIds:
- !Ref NodeSecurityGroup
UserData: !Base64
"Fn::Sub": |
#!/bin/bash
set -o xtrace
/etc/eks/bootstrap.sh ${ClusterName} ${BootstrapArguments}
/opt/aws/bin/cfn-signal --exit-code $? \
--stack ${AWS::StackName} \
--resource NodeGroup \
--region ${AWS::Region}
MetadataOptions:
HttpPutResponseHopLimit : 2
HttpEndpoint: enabled
HttpTokens: !If
- IMDSv1Disabled
- required
- optional
NodeGroup:
Type: "AWS::AutoScaling::AutoScalingGroup"
Properties:
DesiredCapacity: !Ref NodeAutoScalingGroupDesiredCapacity
LaunchTemplate:
LaunchTemplateId: !Ref NodeLaunchTemplate
Version: !GetAtt NodeLaunchTemplate.LatestVersionNumber
MaxSize: !Ref NodeAutoScalingGroupMaxSize
MinSize: !Ref NodeAutoScalingGroupMinSize
Tags:
- Key: Name
PropagateAtLaunch: true
Value: !Sub ${ClusterName}-${NodeGroupName}-Node
- Key: !Sub kubernetes.io/cluster/${ClusterName}
PropagateAtLaunch: true
Value: owned
VPCZoneIdentifier: !Ref Subnets
UpdatePolicy:
AutoScalingRollingUpdate:
MaxBatchSize: 1
MinInstancesInService: !Ref NodeAutoScalingGroupDesiredCapacity
PauseTime: PT5M
其中我們可以觀察到, CloudFormation Logical ID [5] NodeGroup
使用 AWS:: AutoScaling:: AutoScalingGroup
[6] 而非 AWS:: EKS:: Nodegroup
[7]。值得注意的是,AWS:: EKS:: Nodegroup
啟用的是 managed node group。
接續,NodeGroup
關聯了 NodeLaunchTemplate
其定義使用了 EC2 userdata [8],並使用了 cfn-signal
[9]--exit-code
來確定此 bootstrap.sh 是否成功。
eksctl
透過 eksctl
建立 self-managed node group 資源後,可以查看到會建立對應的 CloudFormation Stack。故我們也能使用以下 AWS CLI aws cloudformation describe-stack-resources
命令,與 CloudFormation 所建立的方式類似,也是透過定義 Auto Scaling Group 及定義 Launch Template 來建立。
$ aws cloudformation describe-stack-resources \
--stack-name eksctl-ironman-2022-nodegroup-demo-self-ng
...
...
...
{
"StackName": "eksctl-ironman-2022-nodegroup-demo-self-ng",
"StackId": "arn:aws:cloudformation:eu-west-1:111111111111:stack/eksctl-ironman-2022-nodegroup-demo-self-ng/e2083840-3cda-11ed-882d-026b9e583afb",
"LogicalResourceId": "NodeGroup",
"PhysicalResourceId": "eksctl-ironman-2022-nodegroup-demo-self-ng-NodeGroup-1Q19K6MP0HTFJ",
"ResourceType": "AWS::AutoScaling::AutoScalingGroup",
"Timestamp": "2022-09-25T14:07:16.122000+00:00",
"ResourceStatus": "CREATE_COMPLETE",
"DriftInformation": {
"StackResourceDriftStatus": "NOT_CHECKED"
}
},
{
"StackName": "eksctl-ironman-2022-nodegroup-demo-self-ng",
"StackId": "arn:aws:cloudformation:eu-west-1:111111111111:stack/eksctl-ironman-2022-nodegroup-demo-self-ng/e2083840-3cda-11ed-882d-026b9e583afb",
"LogicalResourceId": "NodeGroupLaunchTemplate",
"PhysicalResourceId": "lt-011b90bb7c3e105d3",
"ResourceType": "AWS::EC2::LaunchTemplate",
"Timestamp": "2022-09-25T14:06:37.330000+00:00",
"ResourceStatus": "CREATE_COMPLETE",
"DriftInformation": {
"StackResourceDriftStatus": "NOT_CHECKED"
}
},
...
...
接著,我們可以透過 aws cloudformation get-template
查看原先的 userdata 定義,此輸出看起來像是經 base64
encode。
$ aws cloudformation get-template --stack-name eksctl-ironman-2022-nodegroup-demo-self-ng --query TemplateBody.Resources.NodeGroupLaunchTemplate.Properties.LaunchTemplateData.UserData
"H4sIAAAAAAAA/7xYbXPiOrL+nl+hy0ydnLl3DLaBnCFV3FpebMKLTSxbMnh2KiUsBb/KHlsEwmz++5YhZMJMztSera39QmLpedqtVquflt/5Sbahkp/x+3B9kRM/JmtWXgO+SZKLYsP9lF5fSEACjQdSNJJw1TgQGqVfhLkoGywufZE0VlkmSlGQvE4StV4Gf4kSsCRnRcXaFqFgd/dh...
... SKIP CONTENT ...
...
pIBWeClafi1HhDti7fcOjMzJ9+cDgyz4Sme9kQaX72mlfQn78lvCXF19LzF4anGngHKLsnm+S4HyRRQbQpBQg58EnJPgKeiUqcaTWyWW242PwndOV7e/W2tPwzAAD//+PPRyJOEgAA"
因此我們可以將此輸出內容透過 base64
decode,透過 file
查看到此 decode 格式為 gzip 的壓縮格式。
$ aws cloudformation get-template --stack-name eksctl-ironman-2022-nodegroup-demo-self-ng --query TemplateBody.Resources.NodeGroupLaunchTemplate.Properties.LaunchTemplateData.UserData --output text | base64 -d > ec2-userdata
$ file ec2-userdata
ec2-userdata: gzip compressed data
而透過 gunzip
解壓縮此文件後,確認此文件為 cloud-init
格式,並且一樣定義了 bootstrap.sh
。
$ mv ec2-userdata ec2-userdata.gz
$ gunzip ./ec2-userdata.gz
$ cat ./ec2-userdata
#cloud-config
packages: null
runcmd:
- - /var/lib/cloud/scripts/eksctl/bootstrap.al2.sh
- - /var/lib/cloud/scripts/eksctl/bootstrap.helper.sh
write_files:
- content: '{}'
owner: root:root
path: /etc/eksctl/kubelet-extra.json
permissions: "0644"
- content: |-
NODE_TAINTS=
CLUSTER_DNS=10.100.0.10
CONTAINER_RUNTIME=dockerd
CLUSTER_NAME=ironman-2022
API_SERVER_URL=https://XXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXX.gr7.eu-west-1.eks.amazonaws.com
B64_CLUSTER_CA=LS0tLS1CRUdJTiBDRVJUSUZJQ0FURS0tLS0tCk1JSUMvakNDQWVhZ0F3SUJBZ0lCQURBTkJna3Foa2lHOXcwQkFR...
... ...
V1lZRExMUDdLCm0xVUJQUWdzTzRQQlREUjlaLzhpbnZDV0FiT0szM2Z6OVZqU3dBbjlhQ0lXbU5FY2dVMkFUWm1FN0N4WEUrbFkKOFhjPQotLS0tLUVORCBDRVJUSUZJQ0FURS0tLS0tCg==
NODE_LABELS=alpha.eksctl.io/cluster-name=ironman-2022,alpha.eksctl.io/nodegroup-name=demo-self-ng
owner: root:root
path: /etc/eksctl/kubelet.env
permissions: "0644"
- content: |
#!/bin/bash
set -o errexit
set -o pipefail
set -o nounset
source /var/lib/cloud/scripts/eksctl/bootstrap.helper.sh
echo "eksctl: running /etc/eks/bootstrap"
/etc/eks/bootstrap.sh "${CLUSTER_NAME}" \
--apiserver-endpoint "${API_SERVER_URL}" \
--b64-cluster-ca "${B64_CLUSTER_CA}" \
--dns-cluster-ip "${CLUSTER_DNS}" \
--kubelet-extra-args "${KUBELET_EXTRA_ARGS}" \
--container-runtime "${CONTAINER_RUNTIME}"
echo "eksctl: merging user options into kubelet-config.json"
trap 'rm -f ${TMP_KUBE_CONF}' EXIT
jq -s '.[0] * .[1]' "${KUBELET_CONFIG}" "${KUBELET_EXTRA_CONFIG}" > "${TMP_KUBE_CONF}"
mv "${TMP_KUBE_CONF}" "${KUBELET_CONFIG}"
systemctl daemon-reload
echo "eksctl: restarting kubelet-eks"
systemctl restart kubelet
echo "eksctl: done"
owner: root:root
path: /var/lib/cloud/scripts/eksctl/bootstrap.al2.sh
permissions: "0755"
- content: |
#!/bin/bash
set -o errexit
set -o pipefail
set -o nounset
source /etc/eksctl/kubelet.env # file written by bootstrapper
# Use IMDSv2 to get metadata
TOKEN="$(curl --silent -X PUT -H "X-aws-ec2-metadata-token-ttl-seconds: 600" http://169.254.169.254/latest/api/token)"
function get_metadata() {
curl --silent -H "X-aws-ec2-metadata-token: $TOKEN" "http://169.254.169.254/latest/meta-data/$1"
}
API_SERVER_URL="${API_SERVER_URL}"
B64_CLUSTER_CA="${B64_CLUSTER_CA}"
INSTANCE_ID="$(get_metadata instance-id)"
INSTANCE_LIFECYCLE="$(get_metadata instance-life-cycle)"
CLUSTER_DNS="${CLUSTER_DNS:-}"
NODE_TAINTS="${NODE_TAINTS:-}"
MAX_PODS="${MAX_PODS:-}"
NODE_LABELS="${NODE_LABELS},node-lifecycle=${INSTANCE_LIFECYCLE},alpha.eksctl.io/instance-id=${INSTANCE_ID}"
KUBELET_ARGS=("--node-labels=${NODE_LABELS}")
[[ -n "${NODE_TAINTS}" ]] && KUBELET_ARGS+=("--register-with-taints=${NODE_TAINTS}")
# --max-pods as a CLI argument is deprecated, this is a workaround until we deprecate support for maxPodsPerNode
[[ -n "${MAX_PODS}" ]] && KUBELET_ARGS+=("--max-pods=${MAX_PODS}")
KUBELET_EXTRA_ARGS="${KUBELET_ARGS[@]}"
CLUSTER_NAME="${CLUSTER_NAME}"
KUBELET_CONFIG='/etc/kubernetes/kubelet/kubelet-config.json'
KUBELET_EXTRA_CONFIG='/etc/eksctl/kubelet-extra.json'
TMP_KUBE_CONF='/tmp/kubelet-conf.json'
CONTAINER_RUNTIME="${CONTAINER_RUNTIME:-dockerd}" # default for al2 just in case, not used in ubuntu
owner: root:root
path: /var/lib/cloud/scripts/eksctl/bootstrap.helper.sh
permissions: "0755"
根據 eksctl
nodebootstrap 文件,上述 bootstrapping/userdata 則是由 nodebootstrap 定義的邏輯,又依據不同作業系統,如 Ubuntu、Amazon Linux 劃分 ubuntu.go
及 al2.go
。因此 UserData 會動態地設定以下設定檔:
kubelet-extra.json
- kubelet user configurationdocker-extra.json
- Docker Daemon extra configkubelet.env
:kubelet 環境變數 vars for kubelet
總結
不論是透過官方文件所提供 CloudFormation template 或是 eksctl
命令方式所建立的 CloudFormation Stack,皆使用了 userdata 方式設定 EKS worker node AMI 設置 bootstrap.sh script 設定必要參數。而 eksctl
更近一步地在 provision 階段,透過 nodebootstrap
pkg 動態建立 cloud-init 設定檔。
參考文件
- [03] 為什麼 EKS worker node 可以自動加入 EKS cluster(一) - https://ithelp.ithome.com.tw/articles/10293460
- [04] 為什麼 EKS worker node 可以自動加入 EKS cluster(二) - https://ithelp.ithome.com.tw/articles/10294101
- Amazon EKS AMI Build Specification - https://github.com/awslabs/amazon-eks-ami
- Launching self-managed Amazon Linux nodes - https://docs.aws.amazon.com/eks/latest/userguide/launch-workers.html
- Resources - Resource fields - https://docs.aws.amazon.com/AWSCloudFormation/latest/UserGuide/resources-section-structure.html#resources-section-structure-resource-fields
- AWS:: AutoScaling:: AutoScalingGroup - https://docs.aws.amazon.com/AWSCloudFormation/latest/UserGuide/aws-properties-as-group.html
- AWS:: EKS:: Nodegroup - https://docs.aws.amazon.com/AWSCloudFormation/latest/UserGuide/aws-resource-eks-nodegroup.html
- Run commands on your Linux instance at launch - https://docs.aws.amazon.com/AWSEC2/latest/UserGuide/user-data.html
- cfn-signal - https://docs.aws.amazon.com/AWSCloudFormation/latest/UserGuide/cfn-signal.html
- Userdata - https://cloudbase-init.readthedocs.io/en/latest/userdata.html
- https://github.com/weaveworks/eksctl/tree/main/pkg/nodebootstrap