Usage of EKS Node Monitoring Agent for collecting node logs

AWS Support usually recommends running the EKS log collector to gather log information for investigating issues. This tool has to run on the EKS worker node itself. However, compliance or regulatory requirements sometimes forbid SREs and administrators from logging in to the Linux nodes. In those cases, the EKS Node Monitoring Agent can help through its NodeDiagnostic resource.

Background

The EKS Node Monitoring Agent (NMA) is primarily known for node health monitoring and repair, as described in the AWS docs. It is available as an EKS managed add-on and as open source in the GitHub repo EKS Node Monitoring Agent.

The NMA runs as a DaemonSet pod on EKS worker nodes.

It is little known that the NMA can collect node logs without anyone having to log in to the node. The installation includes a Kubernetes (K8s) CustomResourceDefinition (CRD) called nodediagnostics.eks.amazonaws.com (kind NodeDiagnostic). The agent on each worker node watches for NodeDiagnostic objects that have the same name as the node itself. When an administrator creates such an object, the agent detects it via the watch, creates a log bundle, and pushes it to an S3 bucket using the presigned URL defined in the spec.logCapture.destination attribute of the NodeDiagnostic. The detailed steps are described in the AWS docs: Retrieve node logs for a managed node using kubectl and S3.
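
As a rough illustration of the flow described above, the following sketch renders a NodeDiagnostic manifest for a given node name and upload URL. render_node_diagnostic is a hypothetical helper (it is not part of the NMA or of kubectl), and the presigned S3 URL is assumed to have been generated beforehand as described in the AWS docs.

```shell
# Hypothetical helper: print a NodeDiagnostic manifest for one node.
#   $1 = node name (the object name must match the node name)
#   $2 = value for spec.logCapture.destination (e.g. a presigned S3 URL)
render_node_diagnostic() {
  local node_name="$1" destination="$2"
  if [ -z "$node_name" ] || [ -z "$destination" ]; then
    echo "usage: render_node_diagnostic <node-name> <destination>" >&2
    return 1
  fi
  # The destination is double-quoted as a precaution, since presigned
  # URLs carry query strings with characters such as '?' and '&'.
  cat <<EOF
apiVersion: eks.amazonaws.com/v1alpha1
kind: NodeDiagnostic
metadata:
  name: ${node_name}
spec:
  logCapture:
    destination: "${destination}"
EOF
}
```

Assuming NODE_NAME and PRESIGNED_URL are already set in the shell, the output can be piped straight to kubectl with render_node_diagnostic "$NODE_NAME" "$PRESIGNED_URL" | kubectl apply -f -.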

One can follow the creation of the log bundle with kubectl logs on the agent pod and by inspecting the NodeDiagnostic object (kubectl describe or kubectl get -o yaml), as shown in the following example:

# NMA agent logs
$ kubectl logs -n kube-system eks-node-monitoring-agent-gzsrf -f
...
{"level":"info","ts":"2026-02-18T17:33:16Z","msg":"beginning log collection","hostname":"<redacted>.compute.internal","controller":"node-diagnostic","controllerGroup":"eks.amazonaws.com","controllerKind":"NodeDiagnostic","NodeDiagnostic":{"name":"<redacted>.compute.internal"},"namespace":"","name":"<redacted>.compute.internal","reconcileID":"3cf8a7ae-3d43-4ca1-a966-a4f8ea2f31c6"}
{"level":"info","ts":"2026-02-18T17:33:25Z","msg":"finished log collection","hostname":"<redacted>.compute.internal","controller":"node-diagnostic","controllerGroup":"eks.amazonaws.com","controllerKind":"NodeDiagnostic","NodeDiagnostic":{"name":"<redacted>.compute.internal"},"namespace":"","name":"<redacted>.compute.internal","reconcileID":"3cf8a7ae-3d43-4ca1-a966-a4f8ea2f31c6","issueCount":0}
{"level":"info","ts":"2026-02-18T17:33:25Z","msg":"uploading logs","hostname":"<redacted>.compute.internal","controller":"node-diagnostic","controllerGroup":"eks.amazonaws.com","controllerKind":"NodeDiagnostic","NodeDiagnostic":{"name":"<redacted>.compute.internal"},"namespace":"","name":"<redacted>.compute.internal","reconcileID":"3cf8a7ae-3d43-4ca1-a966-a4f8ea2f31c6","url":"https://<redacted>.s3.amazonaws.com/2026-02-18/<redacted>.compute.internal.log.tar.gz?AWSAccessKeyId=<redacted>&Signature=<redacted>&Expires=1771436760"}
{"level":"info","ts":"2026-02-18T17:33:25Z","msg":"upload completed successfully","hostname":"<redacted>.compute.internal","controller":"node-diagnostic","controllerGroup":"eks.amazonaws.com","controllerKind":"NodeDiagnostic","NodeDiagnostic":{"name":"<redacted>.compute.internal"},"namespace":"","name":"<redacted>.compute.internal","reconcileID":"3cf8a7ae-3d43-4ca1-a966-a4f8ea2f31c6"}

# NodeDiagnostic K8s object
$ kubectl get nodediagnostic.eks.amazonaws.com/ip-<redacted>.compute.internal -o yaml
apiVersion: eks.amazonaws.com/v1alpha1
kind: NodeDiagnostic
metadata:
...
  name: ip-<redacted>.compute.internal
...
spec:
  logCapture:
    categories:
    - All
    destination: https://<redacted>.s3.amazonaws.com/2026-02-18/ip-<redacted>.compute.internal.log.tar.gz?AWSAccessKeyId=<redacted>&Signature=<redacted>&Expires=1771436760
status:
  captureStatuses:
  - state:
      completed:
        finishedAt: "2026-02-18T17:33:25Z"
        message: successfully uploaded logs with no errors
        reason: Success
        startedAt: "2026-02-18T17:33:16Z"
    type: Log

Update 16 Mar, 2026

EKS add-on eks-node-monitoring-agent version v1.6.1-eksbuild.1 received an important new feature that greatly simplifies log collection; see pull request feat: add "node" destination for NodeDiagnostic #58.

The nodediagnostics.eks.amazonaws.com CRD now accepts node as a valid value for spec.logCapture.destination. It uses the kubelet node/proxy API, described in the upstream K8s blog post Kubernetes 1.27: Query Node Logs Using The Kubelet API.

With it, one can store the collected logs on the node and download them directly from there, without using S3.

Note: The logs are kept on the node for only 10 minutes (see the code)!

Example usage:

$ kubectl explain nodediagnostics.spec.logCapture.destination
GROUP:      eks.amazonaws.com
KIND:       NodeDiagnostic
VERSION:    v1alpha1

FIELD: destination <string>


DESCRIPTION:
    UploadDestination is a URL describing where to deliver a diagnostic
    artifact.
    This can be set to "node" to temporarily store logs on the node for later
    collection.

Create a NodeDiagnostic resource for the node (the node name is <redacted>):

$ kubectl apply -f - <<EOF
apiVersion: eks.amazonaws.com/v1alpha1
kind: NodeDiagnostic
metadata:
    name: <redacted>.eu-west-1.compute.internal
spec:
    logCapture:
        destination: node
EOF
nodediagnostic.eks.amazonaws.com/<redacted>.eu-west-1.compute.internal created

$ kubectl get nodediagnostic.eks.amazonaws.com
NAME                                           AGE
<redacted>.eu-west-1.compute.internal   8s

$ kubectl get nodediagnostic.eks.amazonaws.com <redacted>.eu-west-1.compute.internal  -o yaml
apiVersion: eks.amazonaws.com/v1alpha1
kind: NodeDiagnostic
metadata:
  annotations:
    kubectl.kubernetes.io/last-applied-configuration: |
      {"apiVersion":"eks.amazonaws.com/v1alpha1","kind":"NodeDiagnostic","metadata":{"annotations":{},"name":"<redacted>.eu-west-1.compute.internal"},"spec":{"logCapture":{"destination":"node"}}}
  creationTimestamp: "2026-03-16T06:23:42Z"
  generation: 1
  name: <redacted>.eu-west-1.compute.internal
  resourceVersion: "32176151"
  uid: e48ce23f-cbf1-49fe-9272-041b229ec6e3
spec:
  logCapture:
    categories:
    - All
    destination: node
status:
  captureStatuses:
  - state:
      completed:
        finishedAt: "2026-03-16T06:23:58Z"
        message: successfully saved logs to /host/var/log/support/<redacted>.compute.internal-logs.tar.gz
        reason: Success
        startedAt: "2026-03-16T06:23:42Z"
    type: Log

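The status in the above output shows that the logs were successfully saved to /host/var/log/support/<redacted>.compute.internal-logs.tar.gz. The /host prefix appears to be the agent's view of the node's filesystem, so on the node itself the bundle sits under /var/log/support/, and the kubelet exposes the node's /var/log directory below /logs/.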
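
Because the bundle is only kept for a short time, it can help to poll the status instead of checking it by hand. The following is a minimal sketch, not part of the NMA tooling: capture_reason and wait_for_capture are hypothetical helpers, the JSONPath follows the status layout shown above, and treating any reason other than Success as a failure is an assumption.

```shell
# Print the completion reason of the first capture status, or nothing
# if the capture has not completed yet.
capture_reason() {
  kubectl get nodediagnostic.eks.amazonaws.com "$1" \
    -o jsonpath='{.status.captureStatuses[0].state.completed.reason}'
}

# Poll until the capture completes.
#   $1 = node name, $2 = timeout in seconds (default 300),
#   $3 = poll interval in seconds (default 5)
# Returns 0 on "Success", 1 on any other reason, 2 on timeout.
wait_for_capture() {
  local node_name="$1" timeout="${2:-300}" interval="${3:-5}" waited=0 reason
  while [ "$waited" -lt "$timeout" ]; do
    reason="$(capture_reason "$node_name" 2>/dev/null)"
    if [ -n "$reason" ]; then
      echo "$reason"
      if [ "$reason" = "Success" ]; then return 0; else return 1; fi
    fi
    sleep "$interval"
    waited=$((waited + interval))
  done
  echo "timed out after ${timeout}s" >&2
  return 2
}
```

A call such as wait_for_capture "$NODE_NAME" && echo "bundle ready" would then gate the download step on a successful capture.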

Now use the "Node log query" API to fetch the logs:

$ kubectl get --raw "/api/v1/nodes/<redacted>.eu-west-1.compute.internal/proxy/logs/support/<redacted>.eu-west-1.compute.internal-logs.tar.gz" > <redacted>.eu-west-1.compute.internal-logs.tar.gz
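
The download can be wrapped in a small function so the node name is typed only once. This is a sketch under assumptions: node_log_bundle_path and fetch_node_logs are hypothetical helpers, and the bundle is assumed to be named <node name>-logs.tar.gz, as in the command above.

```shell
# Build the node proxy path of the log bundle for a given node name.
node_log_bundle_path() {
  local node_name="$1"
  printf '/api/v1/nodes/%s/proxy/logs/support/%s-logs.tar.gz\n' \
    "$node_name" "$node_name"
}

# Download the bundle and check that it is a readable gzip-compressed tar.
#   $1 = node name, $2 = output file (default ./<node name>-logs.tar.gz)
fetch_node_logs() {
  local node_name="$1" out_file
  out_file="${2:-./${node_name}-logs.tar.gz}"
  kubectl get --raw "$(node_log_bundle_path "$node_name")" > "$out_file" || return 1
  # Listing the archive is a cheap sanity check; it fails if the
  # response was an error message instead of a tarball.
  tar -tzf "$out_file" > /dev/null
}
```

The archive check is useful because a failed request could otherwise leave a file that only contains an error body; fetch_node_logs returns non-zero in that case.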

Update 19 Mar, 2026

To further simplify log collection, a kubectl plugin, kubectl-ekslogs, was made available in the tools section of the official NMA GitHub repo.

After installation, collecting the logs is as easy as:

$ kubectl ekslogs <redacted>.eu-west-1.compute.internal
✔ Transfer mode: node proxy API
⟳ Validating node(s)...
✔ All 1 node(s) validated
⟳ Creating NodeDiagnostic resources...
⟳ Waiting for log collection to complete (timeout: 300s)...
✔ Log collection completed on all nodes
⟳ Downloading log bundles...
✔ Saved: ./<redacted>.eu-west-1.compute.internal-logs.tar.gz (5,1M)

✔ Done — 1 log bundle(s) downloaded to ./

EKS Auto Mode

On EKS Auto Mode nodes, the NMA is pre-installed as part of the EKS Auto Mode AMI (Amazon Machine Image) and runs as a systemd service. The NodeDiagnostic CRD is also available by default in an EKS Auto Mode cluster.

Note: EKS Auto Mode uses resource-based naming (RBN) for K8s worker nodes, so the node name, and therefore the NodeDiagnostic name, is the instance ID (i-...)!

Update 19 Mar, 2026

Starting with Auto Mode AMI 2026.3.11, the built-in NMA version also supports node as a valid spec.logCapture.destination in the NodeDiagnostic resource, allowing downloads using the kubelet node/proxy feature!

$ kubectl get no i-<redacted> -o wide
NAME                  STATUS   ROLES    AGE     VERSION               INTERNAL-IP       EXTERNAL-IP   OS-IMAGE                                                              KERNEL-VERSION   CONTAINER-RUNTIME
i-<redacted>   Ready    <none>   7h37m   v1.35.0-eks-ac2d5a0   192.168.152.126   <none>        Bottlerocket (EKS Auto, Standard) 2026.3.11 (aws-k8s-1.35-standard)   6.12.73          containerd://2.1.6+bottlerocket

$ kubectl ekslogs i-<redacted>
✔ Transfer mode: node proxy API
⟳ Validating node(s)...
✔ All 1 node(s) validated
⟳ Creating NodeDiagnostic resources...
⟳ Waiting for log collection to complete (timeout: 300s)...
✔ Log collection completed on all nodes
⟳ Downloading log bundles...
✔ Saved: ./i-<redacted>-logs.tar.gz (2,0M)

✔ Done — 1 log bundle(s) downloaded to ./