如何從 CloudWatch Logs 中擷取 Amazon EKS 控制平面日誌?

4 分的閱讀內容
0

我正在排查 Amazon Elastic Kubernetes Service (Amazon EKS) 問題,並且需要從 EKS 控制平面上執行的元件收集日誌。

簡短描述

若要檢視 Amazon CloudWatch Logs 中的日誌,您必須開啟 Amazon EKS 控制平面日誌記錄。您可以在 /aws/eks/cluster-name/cluster 日誌群組中找到 EKS 控制平面日誌。如需詳細資訊,請參閱檢視叢集控制平面日誌

**注意:**將 cluster-name 取代為您的叢集名稱。

您可以使用 CloudWatch Logs Insights 來搜尋 EKS 控制平面日誌資料。如需詳細資訊,請參閱使用 CloudWatch Insights 分析日誌資料

**重要事項:**只有在叢集中開啟控制平面日誌記錄之後,才能在 CloudWatch Logs 中檢視日誌事件。在 CloudWatch Logs Insights 中選取要執行查詢的時間範圍之前,請確認您已開啟控制平面日誌記錄。

解決方案

搜尋 CloudWatch Insights

  1. 開啟 CloudWatch 主控台。
  2. 在導覽窗格中,選擇 Logs (日誌),然後選擇 Log Insights
  3. Select log group(s) (日誌群組) 功能表上,選取要查詢的 cluster log group (叢集日誌群組)。
  4. 選擇 Run (執行) 以檢視結果。

**注意:**若要將結果匯出為 .csv 檔案,或將結果複製到剪貼簿,請選擇 Export results (匯出結果)。您可以變更範例查詢以取得特定使用案例的資料。請參閱這些常見 EKS 使用案例的查詢範例。

常見 EKS 使用案例的查詢範例

若要尋找叢集建立者,請搜尋對應至 kubernetes-admin 使用者的 IAM 實體。

查詢:

fields @logStream, @timestamp, @message
| sort @timestamp desc
| filter @logStream like /authenticator/
| filter @message like "username=kubernetes-admin"
| limit 50

範例輸出:

@logStream, @timestamp @message
authenticator-71976 ca11bea5d3083393f7d32dab75b,2021-08-11-10:09:49.020,"time=""2021-08-11T10:09:43Z"" level=info msg=""access granted"" arn=""arn:aws:iam::12345678910:user/awscli"" client=""127.0.0.1:51326"" groups=""[system:masters]"" method=POST path=/authenticate sts=sts.eu-west-1.amazonaws.com uid=""heptio-authenticator-aws:12345678910:ABCDEFGHIJKLMNOP"" username=kubernetes-admin"

在此輸出中,IAM 使用者 arn:aws:iam::12345678910:user/awscli 對應至使用者 kubernetes-admin

若要尋找特定使用者所執行的要求,請搜尋 kubernetes-admin 使用者執行的作業。

查詢範例:

fields @logStream, @timestamp, @message
| filter @logStream like /^kube-apiserver-audit/
| filter strcontains(user.username,"kubernetes-admin")
| sort @timestamp desc
| limit 50

範例輸出:

@logStream,@timestamp,@message
kube-apiserver-audit-71976ca11bea5d3083393f7d32dab75b,2021-08-11 09:29:13.095,"{...""requestURI"":""/api/v1/namespaces/kube-system/endpoints?limit=500";","string""verb"":""list"",""user"":{""username"":""kubernetes-admin"",""uid"":""heptio-authenticator-aws:12345678910:ABCDEFGHIJKLMNOP"",""groups"":[""system:masters"",""system:authenticated""],""extra"":{""accessKeyId"":[""ABCDEFGHIJKLMNOP""],""arn"":[""arn:aws:iam::12345678910:user/awscli""],""canonicalArn"":[""arn:aws:iam::12345678910:user/awscli""],""sessionName"":[""""]}},""sourceIPs"":[""12.34.56.78""],""userAgent"":""kubectl/v1.22.0 (darwin/amd64) kubernetes/c2b5237"",""objectRef"":{""resource"":""endpoints"",""namespace"":""kube-system"",""apiVersion"":""v1""}...}"

若要尋找特定 userAgent 所發出的 API 呼叫,您可以使用此查詢範例:

fields @logStream, @timestamp, userAgent, verb, requestURI, @message
| filter @logStream like /kube-apiserver-audit/
| filter userAgent like /kubectl\/v1.22.0/
| sort @timestamp desc
| filter verb like /(get)/

縮短的範例輸出:

@logStream,@timestamp,userAgent,verb,requestURI,@message
kube-apiserver-audit-71976ca11bea5d3083393f7d32dab75b,2021-08-11 14:06:47.068,kubectl/v1.22.0 (darwin/amd64) kubernetes/c2b5237,get,/apis/metrics.k8s.io/v1beta1?timeout=32s,"{""kind"":""Event"",""apiVersion"":""audit.k8s.io/v1"",""level"":""Metadata"",""auditID"":""863d9353-61a2-4255-a243-afaeb9183524"",""stage"":""ResponseComplete"",""requestURI"":""/apis/metrics.k8s.io/v1beta1?timeout=32s"",""verb"":""get"",""user"":{""username"":""kubernetes-admin"",""uid"":""heptio-authenticator-aws:12345678910:AIDAUQGC5HFOHXON7M22F"",""groups"":[""system:masters"",""system:authenticated""],""extra"":{""accessKeyId"":[""ABCDEFGHIJKLMNOP""],""arn"":[""arn:aws:iam::12345678910:user/awscli""],""canonicalArn"":[""arn:aws:iam::12345678910:user/awscli""],""sourceIPs"":[""12.34.56.78""],""userAgent"":""kubectl/v1.22.0 (darwin/amd64) kubernetes/c2b5237""...}"

若要尋找對 aws-auth ConfigMap 所執行的變更,您可以使用此範例查詢:

fields @logStream, @timestamp, @message
| filter @logStream like /^kube-apiserver-audit/
| filter requestURI like /\/api\/v1\/namespaces\/kube-system\/configmaps/
| filter objectRef.name = "aws-auth"
| filter verb like /(create|delete|patch)/
| sort @timestamp desc
| limit 50

縮短的範例輸出:

@logStream,@timestamp,@message
kube-apiserver-audit-f01c77ed8078a670a2eb63af6f127163,2021-10-27 05:43:01.850,{""kind"":""Event"",""apiVersion"":""audit.k8s.io/v1"",""level"":""RequestResponse"",""auditID"":""8f9a5a16-f115-4bb8-912f-ee2b1d737ff1"",""stage"":""ResponseComplete"",""requestURI"":""/api/v1/namespaces/kube-system/configmaps/aws-auth?timeout=19s"",""verb"":""patch"",""responseStatus"": {""metadata"": {},""code"": 200 },""requestObject"": {""data"": { contents of aws-auth ConfigMap } },""requestReceivedTimestamp"":""2021-10-27T05:43:01.033516Z"",""stageTimestamp"":""2021-10-27T05:43:01.042364Z"" }

若要尋找拒絕的要求,您可以使用此範例查詢:

fields @logStream, @timestamp, @message
| filter @logStream like /^authenticator/
| filter @message like "denied"
| sort @timestamp desc
| limit 50

範例輸出:

@logStream,@timestamp,@message
authenticator-8c0c570ea5676c62c44d98da6189a02b,2021-08-08 20:04:46.282,"time=""2021-08-08T20:04:44Z"" level=warning msg=""access denied"" client=""127.0.0.1:52856"" error=""sts getCallerIdentity failed: error from AWS (expected 200, got 403)"" method=POST path=/authenticate"

若要尋找對 Pod 排程的節點,請查詢 kube-scheduler 日誌。

查詢範例:

fields @logStream, @timestamp, @message
| sort @timestamp desc
| filter @logStream like /kube-scheduler/
| filter @message like "aws-6799fc88d8-jqc2r"
| limit 50

範例輸出:

@logStream,@timestamp,@message
kube-scheduler-bb3ea89d63fd2b9735ba06b144377db6,2021-08-15 12:19:43.000,"I0915 12:19:43.933124       1 scheduler.go:604] ""Successfully bound pod to node"" pod=""kube-system/aws-6799fc88d8-jqc2r"" node=""ip-192-168-66-187.eu-west-1.compute.internal"" evaluatedNodes=3 feasibleNodes=2"

在此範例輸出中,已在節點 ip-192-168-66-187.eu-west-1.compute.internal 上對 Pod aws-6799fc88d8-jqc2r 排程。

若要尋找 Kubernetes API 伺服器要求的 HTTP 5xx 伺服器錯誤,您可以使用此範例查詢:

fields @logStream, @timestamp, responseStatus.code, @message
| filter @logStream like /^kube-apiserver-audit/
| filter responseStatus.code >= 500
| limit 50

縮短的範例輸出:

@logStream,@timestamp,responseStatus.code,@message
kube-apiserver-audit-4d5145b53c40d10c276ad08fa36d1f11,2021-08-04 07:22:06.518,503,"...""requestURI"":""/apis/metrics.k8s.io/v1beta1?timeout=32s"",""verb"":""get"",""user"":{""username"":""system:serviceaccount:kube-system:resourcequota-controller"",""uid"":""36d9c3dd-f1fd-4cae-9266-900d64d6a754"",""groups"":[""system:serviceaccounts"",""system:serviceaccounts:kube-system"",""system:authenticated""]},""sourceIPs"":[""12.34.56.78""],""userAgent"":""kube-controller-manager/v1.21.2 (linux/amd64) kubernetes/d2965f0/system:serviceaccount:kube-system:resourcequota-controller"",""responseStatus"":{""metadata"":{},""code"":503},..."}}"

若要排查 CronJob 啟用問題,請搜尋 cronjob-controller 發出的 API 呼叫。

查詢範例:

fields @logStream, @timestamp, @message
| filter @logStream like /kube-apiserver-audit/
| filter user.username like "system:serviceaccount:kube-system:cronjob-controller"
| display @logStream, @timestamp, @message, objectRef.namespace, objectRef.name
| sort @timestamp desc
| limit 50

縮短的範例輸出:

{ "kind": "Event", "apiVersion": "audit.k8s.io/v1", "objectRef": { "resource": "cronjobs", "namespace": "default", "name": "hello", "apiGroup": "batch", "apiVersion": "v1" }, "responseObject": { "kind": "CronJob", "apiVersion": "batch/v1", "spec": { "schedule": "*/1 * * * *" }, "status": { "lastScheduleTime": "2021-08-09T07:19:00Z" } } }

在此範例輸出中,預設命名空間中的 hello 任務會每分鐘執行一次,最後一次排定在 2021-08-09T07:19:00Z

若要尋找 replicaset-controller 發出的 API 呼叫,您可以使用此範例查詢:

fields @logStream, @timestamp, @message
| filter @logStream like /kube-apiserver-audit/
| filter user.username like "system:serviceaccount:kube-system:replicaset-controller"
| display @logStream, @timestamp, requestURI, verb, user.username
| sort @timestamp desc
| limit 50

範例輸出:

@logStream,@timestamp,requestURI,verb,user.username
kube-apiserver-audit-8c0c570ea5676c62c44d98da6189a02b,2021-08-10 17:13:53.281,/api/v1/namespaces/kube-system/pods,create,system:serviceaccount:kube-system:replicaset-controller
kube-apiserver-audit-4d5145b53c40d10c276ad08fa36d1f11,2021-08-04 0718:44.561,/apis/apps/v1/namespaces/kube-system/replicasets/coredns-6496b6c8b9/status,update,system:serviceaccount:kube-system:replicaset-controller

若要尋找對 Kubernetes 資源所執行的作業,您可以使用此範例查詢:

fields @logStream, @timestamp, @message
| filter @logStream like /^kube-apiserver-audit/
| filter verb == "delete" and requestURI like "/api/v1/namespaces/default/pods/my-app"
| sort @timestamp desc
| limit 10

上述範例會在 Pod my-app預設命名空間上查詢 delete (刪除) API 呼叫的篩選條件。

縮短的範例輸出:

@logStream,@timestamp,@message
kube-apiserver-audit-e7b3cb08c0296daf439493a6fc9aff8c,2021-08-11 14:09:47.813,"...""requestURI"":""/api/v1/namespaces/default/pods/my-app"",""verb"":""delete"",""user"":{""username""""kubernetes-admin"",""uid"":""heptio-authenticator-aws:12345678910:ABCDEFGHIJKLMNOP"",""groups"":[""system:masters"",""system:authenticated""],""extra"":{""accessKeyId"":[""ABCDEFGHIJKLMNOP""],""arn"":[""arn:aws:iam::12345678910:user/awscli""],""canonicalArn"":[""arn:aws:iam::12345678910:user/awscli""],""sessionName"":[""""]}},""sourceIPs"":[""12.34.56.78""],""userAgent"":""kubectl/v1.22.0 (darwin/amd64) kubernetes/c2b5237"",""objectRef"":{""resource"":""pods"",""namespace"":""default"",""name"":""my-app"",""apiVersion"":""v1""},""responseStatus"":{""metadata"":{},""code"":200},""requestObject"":{""kind"":""DeleteOptions"",""apiVersion"":""v1"",""propagationPolicy"":""Background""},
..."

若要擷取對 Kubernetes API 伺服器呼叫的 HTTP 回應碼計數,您可以使用此範例查詢:

fields @logStream, @timestamp, @message
| filter @logStream like /^kube-apiserver-audit/
| stats count(*) as count by responseStatus.code
| sort count desc

範例輸出:

responseStatus.code,count
200,35066
201,525
403,125
404,116
101,2

若要尋找 kube-system 命名空間中對 DaemonSets/Addons 所執行的變更,您可以使用此範例查詢:

filter @logStream like /^kube-apiserver-audit/
| fields @logStream, @timestamp, @message
| filter verb like /(create|update|delete)/ and strcontains(requestURI,"/apis/apps/v1/namespaces/kube-system/daemonsets")
| sort @timestamp desc
| limit 50

範例輸出:

{ "kind": "Event", "apiVersion": "audit.k8s.io/v1", "level": "RequestResponse", "auditID": "93e24148-0aa6-4166-8086-a689b0031612", "stage": "ResponseComplete", "requestURI": "/apis/apps/v1/namespaces/kube-system/daemonsets/aws-node?fieldManager=kubectl-set", "verb": "patch", "user": { "username": "kubernetes-admin", "groups": [ "system:masters", "system:authenticated" ] }, "userAgent": "kubectl/v1.22.2 (darwin/amd64) kubernetes/8b5a191", "objectRef": { "resource": "daemonsets", "namespace": "kube-system", "name": "aws-node", "apiGroup": "apps", "apiVersion": "v1" }, "requestObject": { "REDACTED": "REDACTED" }, "requestReceivedTimestamp": "2021-08-09T08:07:21.868376Z", "stageTimestamp": "2021-08-09T08:07:21.883489Z", "annotations": { "authorization.k8s.io/decision": "allow", "authorization.k8s.io/reason": "" } }

在此範例輸出中,kubernetes-admin 使用者使用 kubectl v1.22.2 來修補 aws-node DaemonSet。

若要尋找刪除節點的用戶,您可以使用以下範例查詢:

fields @logStream, @timestamp, @message
| filter @logStream like /^kube-apiserver-audit/
| filter verb == "delete" and requestURI like "/api/v1/nodes"
| sort @timestamp desc
| limit 10

縮短的範例輸出:

@logStream,@timestamp,@message
kube-apiserver-audit-e503271cd443efdbd2050ae8ca0794eb,2022-03-25 07:26:55.661,"{"kind":"Event","verb":"delete","user":{"username":"kubernetes-admin","groups":["system:masters","system:authenticated"],"arn":["arn:aws:iam::1234567890:user/awscli"],"canonicalArn":["arn:aws:iam::1234567890:user/awscli"],"sessionName":[""]}},"sourceIPs":["1.2.3.4"],"userAgent":"kubectl/v1.21.5 (darwin/amd64) kubernetes/c285e78","objectRef":{"resource":"nodes","name":"ip-192-168-37-22.eu-west-1.compute.internal","apiVersion":"v1"},"responseStatus":{"metadata":{},"status":"Success","code":200},"requestObject":{"kind":"DeleteOptions","apiVersion":"v1","propagationPolicy":"Background"},"responseObject":{"kind":"Status","apiVersion":"v1","metadata":{},"status":"Success","details":{"name":"ip-192-168-37-22.eu-west-1.compute.internal","kind":"nodes","uid":"518ba070-154e-4400-883a-77a44a075bd0"}},"requestReceivedTimestamp":"2022-03-25T07:26:55.355378Z",}}"

AWS 官方
AWS 官方已更新 1 年前