How do I turn on Container Insights metrics on an Amazon EKS cluster?

I want to configure Amazon CloudWatch Container Insights to see my Amazon Elastic Kubernetes Service (Amazon EKS) cluster metrics.

Resolution

Container Insights is supported only on Linux instances. Amazon provides a CloudWatch agent container image on Amazon Elastic Container Registry (Amazon ECR). For more information, see cloudwatch-agent/cloudwatch-agent.

Before you begin, make sure that you meet the prerequisites for Container Insights in CloudWatch. For AWS Fargate clusters, you must define a Fargate profile to schedule pods. Also, the Amazon EKS pod execution AWS Identity and Access Management (IAM) role must allow components that run on the Fargate infrastructure to make calls to AWS APIs. For example, the IAM role must be able to pull container images from Amazon ECR.

Use the CloudWatch agent to set up Container Insights metrics on your EKS cluster

The CloudWatch agent creates a log group that's named /aws/containerinsights/Cluster_Name/performance, and then sends the performance log events to the log group. When you set up Container Insights to collect metrics, you must deploy the CloudWatch agent container image as a DaemonSet from Docker Hub. By default, you pull the container image as an anonymous user.

Note: Docker might limit the number of images that you can pull.

To use the CloudWatch agent to set up Container Insights, complete the following steps:

  1. Run the following command to create an amazon-cloudwatch namespace:

    kubectl apply -f https://raw.githubusercontent.com/aws-samples/amazon-cloudwatch-container-insights/latest/k8s-deployment-manifest-templates/deployment-mode/daemonset/container-insights-monitoring/cloudwatch-namespace.yaml
  2. Run the following command to create a service account for the CloudWatch agent that's named cloudwatch-agent:

    kubectl apply -f https://raw.githubusercontent.com/aws-samples/amazon-cloudwatch-container-insights/latest/k8s-deployment-manifest-templates/deployment-mode/daemonset/container-insights-monitoring/cwagent/cwagent-serviceaccount.yaml
  3. Run the following command to create a configmap as a configuration file for the CloudWatch agent:

    ClusterName=my-cluster-name
    curl https://raw.githubusercontent.com/aws-samples/amazon-cloudwatch-container-insights/latest/k8s-deployment-manifest-templates/deployment-mode/daemonset/container-insights-monitoring/cwagent/cwagent-configmap.yaml | sed 's/{{cluster_name}}/'${ClusterName}'/' | kubectl apply -f -

    Note: Replace my-cluster-name with the name of your EKS cluster. For more information, see Create a ConfigMap for the CloudWatch agent.

  4. Run the following command to deploy the cloudwatch-agent DaemonSet:

    kubectl apply -f https://raw.githubusercontent.com/aws-samples/amazon-cloudwatch-container-insights/latest/k8s-deployment-manifest-templates/deployment-mode/daemonset/container-insights-monitoring/cwagent/cwagent-daemonset.yaml
  5. (Optional) Run the following command to patch the cloudwatch-agent DaemonSet so that it pulls the CloudWatch agent image from Amazon ECR:

    kubectl patch ds cloudwatch-agent -n amazon-cloudwatch -p '{"spec":{"template":{"spec":{"containers":[{"name":"cloudwatch-agent","image":"public.ecr.aws/cloudwatch-agent/cloudwatch-agent:latest"}]}}}}'

    Note: The CloudWatch agent Docker image on Amazon ECR supports the ARM and AMD64 architectures. Replace the latest image tag with the tag for the image version and architecture that you want to use. For more information, see cloudwatch-agent/cloudwatch-agent.

  6. For IAM roles for service accounts, create an OIDC provider and an IAM role and policy. Then, run the following command to associate the IAM role to the cloudwatch-agent service account:

    kubectl annotate serviceaccounts cloudwatch-agent -n amazon-cloudwatch "eks.amazonaws.com/role-arn=arn:aws:iam::ACCOUNT_ID:role/IAM_ROLE_NAME"

    Note: Replace ACCOUNT_ID with your account ID and IAM_ROLE_NAME with the IAM role that you use for the service accounts.
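After you complete the preceding steps, you can confirm that the agent is running and that the performance log group exists. The following is a minimal sketch that assumes kubectl and the AWS CLI are already configured for the cluster's account and Region. The function names ci_log_group_name and verify_cloudwatch_agent are hypothetical helpers for illustration, and the 120-second timeout is an arbitrary choice:

```shell
#!/bin/sh
# Hypothetical helpers (illustration only, not part of any AWS tool).

# Print the performance log group name for a given cluster name.
ci_log_group_name() {
  printf '/aws/containerinsights/%s/performance\n' "$1"
}

# Wait for the DaemonSet rollout, then list log groups that match the
# expected name. Empty output means that no matching log group exists yet.
verify_cloudwatch_agent() {
  cluster="$1"
  kubectl rollout status daemonset/cloudwatch-agent \
    -n amazon-cloudwatch --timeout=120s || return 1
  aws logs describe-log-groups \
    --log-group-name-prefix "$(ci_log_group_name "$cluster")" \
    --query 'logGroups[].logGroupName' --output text
}

# Example (run against a real cluster):
# verify_cloudwatch_agent my-cluster-name
```

If the rollout succeeds but no log group is returned, then the agent might not have permission to create the log group or might not have sent its first batch of events yet.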

Troubleshoot the CloudWatch agent

Complete the following steps:

  1. Run the following command to retrieve the list of pods:

    kubectl get pods -n amazon-cloudwatch
  2. Run the following command, and then check the events at the bottom of the output. Replace pod-name with the name of a pod from the previous step:

    kubectl describe pod pod-name -n amazon-cloudwatch
  3. Run the following command to check the logs:

    kubectl logs pod-name -n amazon-cloudwatch
    
  4. If you see a CrashLoopBackOff error for the CloudWatch agent, then confirm that you correctly configured your IAM permissions.
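The troubleshooting steps above can be combined into one pass over the namespace. The following sketch assumes that kubectl is configured for the cluster. The function names unhealthy_pods and triage_cloudwatch_agent are hypothetical, and the line counts passed to tail and --tail are arbitrary:

```shell
#!/bin/sh
# Hypothetical helpers (illustration only).

# Read "kubectl get pods --no-headers" output on stdin and print the names
# of pods whose STATUS column (third field) is not Running.
unhealthy_pods() {
  awk '$3 != "Running" { print $1 }'
}

# For each unhealthy pod, show the end of the describe output (where the
# events appear) and recent logs. For pods that restart repeatedly, the
# logs of the previous container instance are tried first.
triage_cloudwatch_agent() {
  kubectl get pods -n amazon-cloudwatch --no-headers | unhealthy_pods |
  while read -r pod; do
    echo "== ${pod} =="
    kubectl describe pod "${pod}" -n amazon-cloudwatch | tail -n 20
    kubectl logs "${pod}" -n amazon-cloudwatch --previous --tail=50 ||
      kubectl logs "${pod}" -n amazon-cloudwatch --tail=50
  done
}

# Example (run against a real cluster):
# triage_cloudwatch_agent
```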

Delete the CloudWatch agent

Run the following command to delete the CloudWatch agent:

kubectl delete -f https://raw.githubusercontent.com/aws-samples/amazon-cloudwatch-container-insights/latest/k8s-deployment-manifest-templates/deployment-mode/daemonset/container-insights-monitoring/cloudwatch-namespace.yaml

Note: If you delete the namespace, then the associated resources are also deleted.

Use Distro for OpenTelemetry to set up Container Insights metrics on your EKS cluster

AWS Distro for OpenTelemetry creates a log group that's named /aws/containerinsights/Cluster_Name/performance, and then sends the performance log events to the log group.

Complete the following steps:

  1. Run the following command to deploy the AWS OpenTelemetry Collector as a DaemonSet:

    curl https://raw.githubusercontent.com/aws-observability/aws-otel-collector/main/deployment-template/eks/otel-container-insights-infra.yaml | kubectl apply -f -
    

    Note: For more information, see Container Insights EKS infrastructure metrics.

  2. Run the following command to confirm that the collector is running:

    kubectl get pods -l name=aws-otel-eks-ci -n aws-otel-eks
    
  3. (Optional) Run the following command to patch the aws-otel-eks-ci DaemonSet so that it pulls the aws-otel-collector Docker image from Amazon ECR:

    kubectl patch ds aws-otel-eks-ci -n aws-otel-eks -p '{"spec":{"template":{"spec":{"containers":[{"name":"aws-otel-collector","image":"public.ecr.aws/aws-observability/aws-otel-collector:latest"}]}}}}'

    Note: The aws-otel-collector Docker image on Amazon ECR supports the ARM and AMD64 architectures. Replace the latest image tag with the tag for the image version and architecture that you want to use.

  4. (Optional) For IAM roles for service accounts, create an OIDC provider and an IAM role and policy. Then, run the following command to associate the IAM role to the aws-otel-sa service account:

    kubectl annotate serviceaccounts aws-otel-sa -n aws-otel-eks "eks.amazonaws.com/role-arn=arn:aws:iam::ACCOUNT_ID:role/IAM_ROLE_NAME"
    

    Note: Replace ACCOUNT_ID with your account ID and IAM_ROLE_NAME with the IAM role that you use for the service accounts.
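The optional patch commands in this article use the latest tag. To pin a specific image version instead, it can help to generate the patch document from variables. The following sketch assumes a POSIX shell. The image_patch function is a hypothetical helper, and IMAGE_TAG is a placeholder that you must replace with a tag that exists in the public Amazon ECR repository:

```shell
#!/bin/sh
# Hypothetical helper (illustration only).

# Print a strategic merge patch that sets the image of one named container
# in a DaemonSet or StatefulSet pod template.
# Arguments: container name, full image reference (repository:tag).
image_patch() {
  printf '{"spec":{"template":{"spec":{"containers":[{"name":"%s","image":"%s"}]}}}}\n' "$1" "$2"
}

# Example (run against a real cluster; replace IMAGE_TAG first):
# kubectl patch ds aws-otel-eks-ci -n aws-otel-eks \
#   -p "$(image_patch aws-otel-collector public.ecr.aws/aws-observability/aws-otel-collector:IMAGE_TAG)"
```

Because the shell passes the generated JSON as a single quoted argument, this approach avoids hand-editing the long patch string for each image change.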

Delete Distro for OpenTelemetry

Run the following command to delete Distro for OpenTelemetry:

curl https://raw.githubusercontent.com/aws-observability/aws-otel-collector/main/deployment-template/eks/otel-container-insights-infra.yaml | kubectl delete -f -

Use Distro for OpenTelemetry to set up Container Insights metrics on an EKS Fargate cluster

For applications that run on EKS and Fargate, you can use Distro for OpenTelemetry to set up Container Insights.
The AWS OpenTelemetry Collector sends the following metrics to CloudWatch for every workload that runs on EKS Fargate:

  • pod_cpu_utilization_over_pod_limit
  • pod_cpu_usage_total
  • pod_cpu_limit
  • pod_memory_utilization_over_pod_limit
  • pod_memory_working_set
  • pod_memory_limit
  • pod_network_rx_bytes
  • pod_network_tx_bytes

The AWS OpenTelemetry Collector collects each metric under the CloudWatch namespace that's named ContainerInsights. From the CloudWatch console, choose Metrics, and then choose All metrics. You can navigate to ContainerInsights under Custom Namespace.

Each metric is associated with the following dimension sets:

  • ClusterName, LaunchType
  • ClusterName, Namespace, LaunchType
  • ClusterName, Namespace, PodName, LaunchType

For more information, see Container Insights EKS Fargate.
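As an alternative to the console, you can check from the AWS CLI whether these metrics have reached CloudWatch. The following sketch assumes that the AWS CLI is configured for the cluster's account and Region. The function names fargate_ci_metric_names and list_fargate_ci_metrics are hypothetical helpers:

```shell
#!/bin/sh
# Hypothetical helpers (illustration only).

# Print the metric names that the collector sends for EKS Fargate workloads.
fargate_ci_metric_names() {
  cat <<'EOF'
pod_cpu_utilization_over_pod_limit
pod_cpu_usage_total
pod_cpu_limit
pod_memory_utilization_over_pod_limit
pod_memory_working_set
pod_memory_limit
pod_network_rx_bytes
pod_network_tx_bytes
EOF
}

# For each metric name, print how many metric entries CloudWatch lists in the
# ContainerInsights namespace for the given cluster.
list_fargate_ci_metrics() {
  cluster="$1"
  fargate_ci_metric_names | while read -r metric; do
    printf '%s: ' "${metric}"
    aws cloudwatch list-metrics \
      --namespace ContainerInsights \
      --metric-name "${metric}" \
      --dimensions "Name=ClusterName,Value=${cluster}" \
      --query 'length(Metrics)' --output text
  done
}

# Example (run against a real account):
# list_fargate_ci_metrics your-cluster-name
```

A count of 0 for every metric suggests that the collector isn't publishing yet. Newly published metrics can take several minutes to appear in the list.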

To deploy Distro for OpenTelemetry in your EKS Fargate cluster, complete the following steps:

  1. Create a namespace that's named fargate-container-insights.

  2. Use the following script to create an IAM role that's named EKS-Fargate-ADOT-ServiceAccount-Role and that's associated with a Kubernetes service account that's named adot-collector. The following helper script requires eksctl:

    #!/bin/bash
    CLUSTER_NAME=YOUR-EKS-CLUSTER-NAME
    REGION=YOUR-EKS-CLUSTER-REGION
    SERVICE_ACCOUNT_NAMESPACE=fargate-container-insights
    SERVICE_ACCOUNT_NAME=adot-collector
    SERVICE_ACCOUNT_IAM_ROLE=EKS-Fargate-ADOT-ServiceAccount-Role
    SERVICE_ACCOUNT_IAM_POLICY=arn:aws:iam::aws:policy/CloudWatchAgentServerPolicy
    
    kubectl create ns fargate-container-insights
    
    eksctl utils associate-iam-oidc-provider \
    --cluster=$CLUSTER_NAME \
    --approve
    
    eksctl create iamserviceaccount \
    --cluster=$CLUSTER_NAME \
    --region=$REGION \
    --name=$SERVICE_ACCOUNT_NAME \
    --namespace=$SERVICE_ACCOUNT_NAMESPACE \
    --role-name=$SERVICE_ACCOUNT_IAM_ROLE \
    --attach-policy-arn=$SERVICE_ACCOUNT_IAM_POLICY \
    --approve

    Note: Replace YOUR-EKS-CLUSTER-NAME with your cluster name and YOUR-EKS-CLUSTER-REGION with your AWS Region.

  3. Run the following command to deploy the AWS OpenTelemetry Collector as a Kubernetes StatefulSet:

    ClusterName=your-cluster-name
    Region=your-cluster-region
    curl https://raw.githubusercontent.com/aws-observability/aws-otel-collector/main/deployment-template/eks/otel-fargate-container-insights.yaml | sed 's/YOUR-EKS-CLUSTER-NAME/'${ClusterName}'/;s/us-east-1/'${Region}'/' | kubectl apply -f -
    

    Note: Replace your-cluster-name with your cluster's name and your-cluster-region with the Region that your cluster is located in. Confirm that you have a matching Fargate profile to provision the StatefulSet pods on Fargate.

  4. Run the following command to verify that the Distro for OpenTelemetry Collector pod is running:

    kubectl get pods -n fargate-container-insights
  5. (Optional) Run the following command to patch the adot-collector StatefulSet so that it pulls the aws-otel-collector Docker image from Amazon ECR:

    kubectl patch sts adot-collector -n fargate-container-insights -p '{"spec":{"template":{"spec":{"containers":[{"name":"adot-collector","image":"public.ecr.aws/aws-observability/aws-otel-collector:latest"}]}}}}'
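After the deployment, you can check that the StatefulSet finished its rollout and that its pods were scheduled on Fargate. The following sketch assumes that kubectl is configured for the cluster and that Fargate node names begin with fargate-, which is an assumption about node naming that you should confirm in your own cluster. The function names not_on_fargate and check_adot_collector are hypothetical, and the 300-second timeout is arbitrary:

```shell
#!/bin/sh
# Hypothetical helpers (illustration only).

# Read "pod-name<TAB>node-name" lines on stdin and print the names of pods
# whose node name does not start with "fargate-". Pods that are still
# pending have an empty node name and are printed too.
not_on_fargate() {
  awk -F '\t' '$2 !~ /^fargate-/ { print $1 }'
}

# Wait for the StatefulSet rollout, then report pods in the namespace that
# are not running on a Fargate node. No output means that all pods are.
check_adot_collector() {
  kubectl rollout status statefulset/adot-collector \
    -n fargate-container-insights --timeout=300s || return 1
  kubectl get pods -n fargate-container-insights \
    -o jsonpath='{range .items[*]}{.metadata.name}{"\t"}{.spec.nodeName}{"\n"}{end}' |
    not_on_fargate
}

# Example (run against a real cluster):
# check_adot_collector
```

If a pod stays pending, then confirm that a Fargate profile selects the fargate-container-insights namespace.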
    

Delete Distro for OpenTelemetry

Run the following command to delete Distro for OpenTelemetry:

ClusterName=your-cluster-name
Region=your-cluster-region
eksctl delete iamserviceaccount --cluster "${ClusterName}" --region "${Region}" --name adot-collector --namespace fargate-container-insights
curl https://raw.githubusercontent.com/aws-observability/aws-otel-collector/main/deployment-template/eks/otel-fargate-container-insights.yaml | sed 's/YOUR-EKS-CLUSTER-NAME/'${ClusterName}'/;s/us-east-1/'${Region}'/' | kubectl delete -f -

Note: Replace your-cluster-name with your cluster name and your-cluster-region with the Region that your cluster is located in.

3 Comments

Does this work in the GovCloud regions, or only commercial?

replied a year ago

Thank you for your comment. We'll review and update the Knowledge Center article as needed.

AWS
MODERATOR
replied a year ago

Hi, I followed the instructions to deploy the AWS OTel Collector for Fargate, but something is not clear. After the final step (before uninstalling aws-otel-collector), what comes next? I can't see anything in the CloudWatch dashboard.

replied 6 months ago