
[15] Why a security group can be attached to an individual EKS Pod - Security groups for pods (Part 2)

In the previous post we confirmed two things: both the Node and the Pod resources were updated with the corresponding vpc.amazonaws.com/pod-eni resource. Below we analyze what happens on the node and on the AWS resources:

Analyzing the implementation

Let's review the Pod annotation vpc.amazonaws.com/pod-eni:

vpc.amazonaws.com/pod-eni: [{"eniId":"eni-09a7a909e06fc4be6","ifAddress":"02:b0:01:6c:b5:79","privateIp":"192.168.26.162","vlanId":1,"subnetCidr":"192.168.0.0/19"}]

The annotation (which can be read back with kubectl, as shown after this list) tells us:

  • Elastic network interface (ENI) ID: eni-09a7a909e06fc4be6
  • Private IP of the branch interface: 192.168.26.162
  • VLAN ID in use: 1
  • Subnet CIDR: 192.168.0.0/19
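
These values can be read straight from the Pod object. A quick check, assuming the Pod from the previous post is named aws-cli in the current namespace:

$ kubectl get pod aws-cli -o jsonpath='{.metadata.annotations.vpc\.amazonaws\.com/pod-eni}'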

EKS worker node

To verify whether the aws-cli Pod actually uses a VLAN, first SSH into the EKS worker node and inspect the network interface configuration and routing tables:

[ec2-user@ip-192-168-31-166 ~]$ ip a
1: lo: <LOOPBACK,UP,LOWER_UP> mtu 65536 qdisc noqueue state UNKNOWN group default qlen 1000
link/loopback 00:00:00:00:00:00 brd 00:00:00:00:00:00
inet 127.0.0.1/8 scope host lo
valid_lft forever preferred_lft forever
inet6 ::1/128 scope host
valid_lft forever preferred_lft forever
2: eth0: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 9001 qdisc mq state UP group default qlen 1000
link/ether 02:a5:ee:3d:4a:c9 brd ff:ff:ff:ff:ff:ff
inet 192.168.31.166/19 brd 192.168.31.255 scope global dynamic eth0
valid_lft 2374sec preferred_lft 2374sec
inet6 fe80::a5:eeff:fe3d:4ac9/64 scope link
valid_lft forever preferred_lft forever
3: eth1: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 9001 qdisc mq state UP group default qlen 1000
link/ether 02:27:6e:aa:23:15 brd ff:ff:ff:ff:ff:ff
inet 192.168.9.59/19 brd 192.168.31.255 scope global eth1
valid_lft forever preferred_lft forever
inet6 fe80::27:6eff:feaa:2315/64 scope link
valid_lft forever preferred_lft forever
4: vlanafd4c544f81@if3: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 9001 qdisc noqueue state UP group default
link/ether 3e:fa:50:37:79:5b brd ff:ff:ff:ff:ff:ff link-netnsid 0
inet6 fe80::3cfa:50ff:fe37:795b/64 scope link
valid_lft forever preferred_lft forever
5: vlan.eth.1@eth1: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 9001 qdisc noqueue state UP group default qlen 1000
link/ether 02:b0:01:6c:b5:79 brd ff:ff:ff:ff:ff:ff
inet6 fe80::b0:1ff:fe6c:b579/64 scope link
valid_lft forever preferred_lft forever
[ec2-user@ip-192-168-31-166 ~]$ ip route show table all
default via 192.168.0.1 dev eth1 table 3
192.168.0.1 dev eth1 table 3 scope link
default via 192.168.0.1 dev vlan.eth.1 table 101
192.168.0.1 dev vlan.eth.1 table 101 scope link
192.168.26.162 dev vlanafd4c544f81 table 101 scope link
default via 192.168.0.1 dev eth0
169.254.169.254 dev eth0
192.168.0.0/19 dev eth0 proto kernel scope link src 192.168.31.166
broadcast 127.0.0.0 dev lo table local proto kernel scope link src 127.0.0.1
local 127.0.0.0/8 dev lo table local proto kernel scope host src 127.0.0.1
local 127.0.0.1 dev lo table local proto kernel scope host src 127.0.0.1
broadcast 127.255.255.255 dev lo table local proto kernel scope link src 127.0.0.1
broadcast 192.168.0.0 dev eth0 table local proto kernel scope link src 192.168.31.166
broadcast 192.168.0.0 dev eth1 table local proto kernel scope link src 192.168.9.59
local 192.168.9.59 dev eth1 table local proto kernel scope host src 192.168.9.59
local 192.168.31.166 dev eth0 table local proto kernel scope host src 192.168.31.166
broadcast 192.168.31.255 dev eth0 table local proto kernel scope link src 192.168.31.166
broadcast 192.168.31.255 dev eth1 table local proto kernel scope link src 192.168.9.59
unreachable ::/96 dev lo metric 1024 pref medium
unreachable ::ffff:0.0.0.0/96 dev lo metric 1024 pref medium
unreachable 2002:a00::/24 dev lo metric 1024 pref medium
unreachable 2002:7f00::/24 dev lo metric 1024 pref medium
unreachable 2002:a9fe::/32 dev lo metric 1024 pref medium
unreachable 2002:ac10::/28 dev lo metric 1024 pref medium
unreachable 2002:c0a8::/32 dev lo metric 1024 pref medium
unreachable 2002:e000::/19 dev lo metric 1024 pref medium
unreachable 3ffe:ffff::/32 dev lo metric 1024 pref medium
fe80::/64 dev eth0 proto kernel metric 256 pref medium
fe80::/64 dev eth1 proto kernel metric 256 pref medium
fe80::/64 dev vlanafd4c544f81 proto kernel metric 256 pref medium
fe80::/64 dev vlan.eth.1 proto kernel metric 256 pref medium
local ::1 dev lo table local proto kernel metric 0 pref medium
local fe80::27:6eff:feaa:2315 dev eth1 table local proto kernel metric 0 pref medium
local fe80::a5:eeff:fe3d:4ac9 dev eth0 table local proto kernel metric 0 pref medium
local fe80::b0:1ff:fe6c:b579 dev vlan.eth.1 table local proto kernel metric 0 pref medium
local fe80::3cfa:50ff:fe37:795b dev vlanafd4c544f81 table local proto kernel metric 0 pref medium
multicast ff00::/8 dev eth0 table local proto kernel metric 256 pref medium
multicast ff00::/8 dev eth1 table local proto kernel metric 256 pref medium
multicast ff00::/8 dev vlanafd4c544f81 table local proto kernel metric 256 pref medium
multicast ff00::/8 dev vlan.eth.1 table local proto kernel metric 256 pref medium

From this output we can see:

  • An additional network interface eth1 with IP 192.168.9.59 has appeared.
  • vlan.eth.1@eth1 is a VLAN device stacked on eth1.
  • IP 192.168.26.162 is routed to the device vlanafd4c544f81 in route table 101; this is exactly the IP used by the aws-cli Pod (see the check below).
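
To confirm that the Pod's traffic is really steered through table 101, the policy routing rules and the VLAN device can be inspected as well. A minimal check (device names taken from the output above; the exact rule text may vary by CNI version):

$ ip rule show
# expect rules added by the CNI that direct traffic from/to the Pod IP 192.168.26.162
# into table 101, e.g. "from 192.168.26.162 lookup 101"
$ ip -d link show vlan.eth.1
# the detailed output includes "vlan protocol 802.1Q id 1", matching vlanId 1 in the Pod annotation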

AWS resource

Based on the ENI ID eni-09a7a909e06fc4be6 from the Pod annotation, investigate further:

$ aws ec2 describe-network-interfaces --network-interface-ids eni-09a7a909e06fc4be6
{
    "NetworkInterfaces": [
        {
            "AvailabilityZone": "eu-west-1a",
            "Description": "aws-k8s-branch-eni",
            "Groups": [
                {
                    "GroupName": "eksctl-ironman-2022-nodegroup-ng1-public-ssh-remoteAccess",
                    "GroupId": "sg-079157a483bcb371f"
                }
            ],
            "InterfaceType": "branch",
            "Ipv6Addresses": [],
            "MacAddress": "02:b0:01:6c:b5:79",
            "NetworkInterfaceId": "eni-09a7a909e06fc4be6",
            "OwnerId": "111111111111",
            "PrivateDnsName": "ip-192-168-26-162.eu-west-1.compute.internal",
            "PrivateIpAddress": "192.168.26.162",
            "PrivateIpAddresses": [
                {
                    "Primary": true,
                    "PrivateDnsName": "ip-192-168-26-162.eu-west-1.compute.internal",
                    "PrivateIpAddress": "192.168.26.162"
                }
            ],
            "RequesterId": "014851456324",
            "RequesterManaged": false,
            "SourceDestCheck": true,
            "Status": "in-use",
            "SubnetId": "subnet-0e863f9fbcda592a3",
            "TagSet": [
                {
                    "Key": "eks:eni:owner",
                    "Value": "eks-vpc-resource-controller"
                },
                {
                    "Key": "kubernetes.io/cluster/ironman-2022",
                    "Value": "owned"
                },
                {
                    "Key": "vpcresources.k8s.aws/trunk-eni-id",
                    "Value": "eni-01f7b73cf13962fe5"
                },
                {
                    "Key": "vpcresources.k8s.aws/vlan-id",
                    "Value": "1"
                }
            ],
            "VpcId": "vpc-0b150640c72b722db"
        }
    ]
}

From the output and the tags, this is an aws-k8s-branch-eni whose owner is eks-vpc-resource-controller, and it also records the trunk ENI ID eni-01f7b73cf13962fe5.

Next, check the details of this ENI, eni-01f7b73cf13962fe5:

$ aws ec2 describe-network-interfaces --network-interface-ids eni-01f7b73cf13962fe5
{
    "NetworkInterfaces": [
        {
            "Attachment": {
                "AttachTime": "2022-09-29T10:29:58+00:00",
                "AttachmentId": "eni-attach-0f6699c0476404a29",
                "DeleteOnTermination": true,
                "DeviceIndex": 2,
                "NetworkCardIndex": 0,
                "InstanceId": "i-01ac2403f4932c7ee",
                "InstanceOwnerId": "111111111111",
                "Status": "attached"
            },
            "AvailabilityZone": "eu-west-1a",
            "Description": "aws-k8s-trunk-eni",
            "Groups": [
                {
                    "GroupName": "eksctl-ironman-2022-nodegroup-ng1-public-ssh-remoteAccess",
                    "GroupId": "sg-079157a483bcb371f"
                },
                {
                    "GroupName": "eks-cluster-sg-ironman-2022-961501171",
                    "GroupId": "sg-032f6354cf0b25196"
                }
            ],
            "InterfaceType": "trunk",
            "Ipv6Addresses": [],
            "MacAddress": "02:27:6e:aa:23:15",
            "NetworkInterfaceId": "eni-01f7b73cf13962fe5",
            "OwnerId": "111111111111",
            "PrivateDnsName": "ip-192-168-9-59.eu-west-1.compute.internal",
            "PrivateIpAddress": "192.168.9.59",
            "PrivateIpAddresses": [
                ...
            ],
            "RequesterId": "014851456324",
            "RequesterManaged": false,
            "SourceDestCheck": true,
            "Status": "in-use",
            "SubnetId": "subnet-0e863f9fbcda592a3",
            "TagSet": [
                {
                    "Key": "kubernetes.io/cluster/ironman-2022",
                    "Value": "owned"
                },
                {
                    "Key": "eks:eni:owner",
                    "Value": "eks-vpc-resource-controller"
                }
            ],
            "VpcId": "vpc-0b150640c72b722db"
        }
    ]
}

  • It was attached to instance i-01ac2403f4932c7ee at 2022-09-29T10:29:58+00:00, which is roughly when the AWS VPC CNI plugin environment variable ENABLE_POD_ENI was set.
  • The eks:eni:owner tag is eks-vpc-resource-controller.

Workflow

The behavior above can also be cross-checked through Kubernetes audit logs or CloudTrail logs. The official AWS blog [1] summarizes the workflow as follows:
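
For instance, the ENI activity can be looked up by resource name in the CloudTrail event history (a sketch; subject to the 90-day event history retention and the region in use):

$ aws cloudtrail lookup-events \
    --lookup-attributes AttributeKey=ResourceName,AttributeValue=eni-01f7b73cf13962fe5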

Initializing the node and updating the branch interface resource

  1. The user sets the VPC CNI plugin environment variable ENABLE_POD_ENI.
  2. The VPC CNI plugin component, the IP address management daemon (ipamd), sets the Kubernetes node label vpc.amazonaws.com/has-trunk-attached=false.
  3. eks-vpc-resource-controller creates a trunk interface and attaches it to the node.
  4. vpc.amazonaws.com/pod-eni is advertised to the API server as a Node extended resource [2].
  5. eks-vpc-resource-controller updates the label to vpc.amazonaws.com/has-trunk-attached=true.

Source: Introducing security groups for pods
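
The first two steps can be reproduced with kubectl; a minimal sketch (aws-node is the standard DaemonSet name of the VPC CNI plugin):

# Enable Security Groups for Pods on the VPC CNI plugin
$ kubectl set env daemonset aws-node -n kube-system ENABLE_POD_ENI=true

# After eks-vpc-resource-controller attaches the trunk ENI, the node label flips to true
$ kubectl get nodes -L vpc.amazonaws.com/has-trunk-attached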

Scheduling Pods onto nodes

  1. The user creates a SecurityGroupPolicy object (a CustomResourceDefinition [3]) and a Pod, associated with each other via labels.
  2. The mutating webhook vpc-resource-mutating-webhook injects the Pod resource limit/request vpc.amazonaws.com/pod-eni=1.
  3. eks-vpc-resource-controller creates and associates the branch interface.
  4. eks-vpc-resource-controller updates the Pod annotation with the branch interface information.

Source: Introducing security groups for pods
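
Step 1 corresponds to a SecurityGroupPolicy object like the following; a minimal sketch in which the policy name and the app: aws-cli label are hypothetical, while the security group ID is the one seen on the branch ENI above:

$ cat <<'EOF' | kubectl apply -f -
apiVersion: vpcresources.k8s.aws/v1beta1
kind: SecurityGroupPolicy
metadata:
  name: aws-cli-sgp                # hypothetical name
spec:
  podSelector:
    matchLabels:
      app: aws-cli                 # hypothetical label; must match the Pod's labels
  securityGroups:
    groupIds:
      - sg-079157a483bcb371f       # security group(s) to attach to matching Pods
EOF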

Setting up the Pod network

This part of the flow is similar to how the VPC CNI plugin normally sets up Pod and host routing; for details, see Proposal: CNI plugin for Kubernetes networking over AWS VPC [4]:

  1. kubelet invokes the CNI binary to set up the Pod network.
  2. The CNI binary requests the branch ENI information from ipamd.
  3. ipamd reads the Pod annotation through the Kubernetes API.
  4. ipamd returns the branch ENI information to the CNI binary.
  5. The CNI binary creates the VLAN device, route table, and host-veth pair routing rules on the worker node.

Source: Introducing security groups for pods
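
Step 5 matches the host-side state observed earlier. A rough sketch of equivalent commands (device and table names taken from the ip output above; the CNI binary performs this programmatically, not via the shell):

# VLAN subinterface with ID 1 on top of the trunk ENI eth1
$ ip link add link eth1 name vlan.eth.1 type vlan id 1

# Per-VLAN route table 101 with a default route via the subnet gateway
$ ip route add default via 192.168.0.1 dev vlan.eth.1 table 101

# Route the Pod IP to the host side of its veth pair and steer its traffic into table 101
$ ip route add 192.168.26.162 dev vlanafd4c544f81 table 101
$ ip rule add from 192.168.26.162 lookup 101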

Summary

From the operating-system perspective, the VPC CNI plugin wires up the networking with a VLAN device and a host-veth pair, and associates the Pod IP with the corresponding interface. From the VPC perspective, ENI trunking/branching on AWS Nitro-based EC2 instances makes it possible to assign a dedicated branch ENI to the Pod, and therefore to attach security groups to that ENI.

References

  1. Introducing security groups for pods - https://aws.amazon.com/blogs/containers/introducing-security-groups-for-pods/
  2. Advertise Extended Resources for a Node - https://kubernetes.io/docs/tasks/administer-cluster/extended-resource-node/
  3. Extend the Kubernetes API with CustomResourceDefinitions - https://kubernetes.io/docs/tasks/extend-kubernetes/custom-resources/custom-resource-definitions/
  4. Proposal: CNI plugin for Kubernetes networking over AWS VPC - https://github.com/aws/amazon-vpc-cni-k8s/blob/master/docs/cni-proposal.md