My Amazon Elastic Kubernetes Service (Amazon EKS) Anywhere cluster failed the creation process, and I want to manually clean up my resources.
Short description
When you try to create an Amazon EKS Anywhere cluster, the cluster creation might fail for various reasons. An unfinished cluster creation might keep some unwanted resources on your machine. If you can't use eksctl to remove these resources, then you can remove them manually.
When you create an EKS Anywhere cluster, the process also creates a bootstrap cluster on the administrative machine. This bootstrap cluster is a Kubernetes in Docker (KinD) cluster, and it facilitates the creation of the EKS Anywhere cluster. To clean up this KinD cluster, stop the KinD containers and remove the KinD container images. This workflow successfully cleans up resources that use Docker as the provider. For other providers, you must complete additional steps on the virtual machines (VMs) for the control plane and nodes.
Resolution
Clean up resources on the administrative machine (for Docker)
To clean up resources that remain on the administrative machine, use the following script for all use cases. If you use Docker as your provider, then this step successfully removes all unwanted resources that remain from the failed cluster creation.
Because the kind delete cluster command requires a local KinD installation, the script doesn't use that command. EKS Anywhere runs KinD binaries from temporary containers to set up clusters, so KinD might not be installed on the administrative machine:
EKSA_CLUSTER_NAME="YOUR_CLUSTER_NAME"
# Clean up KIND Cluster Containers
kind_container_ids=$(docker ps -a | grep "${EKSA_CLUSTER_NAME}" | awk -F ' ' '{print $1}')
for container_id in $kind_container_ids; do echo "deleting container with id ${container_id}"; docker rm -f ${container_id}; done
# Clean up EKS-A tools Containers
tools_container_ids=$(docker ps -a | grep "public.ecr.aws/eks-anywhere/cli-tools" | awk -F ' ' '{print $1}')
for container_id in $tools_container_ids; do echo "deleting container with id ${container_id}"; docker rm -f ${container_id}; done
# Delete All EKS-Anywhere Images
image_ids=$(docker image ls | grep "public.ecr.aws/eks-anywhere/" | awk -F ' ' '{print $3}')
for image_id in $image_ids; do echo "deleting image with id ${image_id}"; docker image rm ${image_id}; done
# Delete Auto-generated Cluster Folder (:? stops the script if the name is empty)
rm -rf -- "${EKSA_CLUSTER_NAME:?}"
Note: Replace YOUR_CLUSTER_NAME with your EKS Anywhere cluster name.
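The grep filters in the preceding script match the cluster name anywhere in a docker ps line, so a short cluster name can also match unrelated containers. The following is a minimal sketch of a stricter preview step. It isn't part of the original script. It assumes that the container names start with the cluster name followed by a hyphen, which you should confirm against your own docker ps -a output. The function name list_cluster_container_ids is hypothetical.

```shell
# Hypothetical helper: print the IDs of containers whose name starts with
# "<cluster name>-". Input: one "ID NAME" pair per line on stdin.
# Assumption: container names begin with the cluster name and a hyphen.
list_cluster_container_ids() {
  # index() does a literal (non-regex) match; == 1 means "at the start"
  awk -v prefix="$1-" 'index($2, prefix) == 1 { print $1 }'
}

# Example (preview only, nothing is deleted):
#   docker ps -a --format '{{.ID}} {{.Names}}' \
#     | list_cluster_container_ids "YOUR_CLUSTER_NAME"
```

Review the printed IDs before you pass them to docker rm -f. A literal prefix match avoids accidental hits on image names or command text that happen to contain the cluster name.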
Steps for Bare Metal, Nutanix, CloudStack, and vSphere
EKS Anywhere uses Kubernetes Cluster API to provision the components of the Kubernetes cluster. If the process creates any VMs before it fails, then you must clean up these VMs manually.
You can create and manage EKS Anywhere clusters either with a separate management cluster or without one. Bare metal clusters don't support a separate management cluster. If you use bare metal clusters, then see the Clusters without a management cluster section.
To clean up your VMs, follow the relevant steps depending on your cluster setup:
Clusters without a management cluster
For clusters without a separate management cluster, power off and delete all worker node VMs, etcd VMs, and control plane (API server) VMs.
Note: VMs that are associated with Nutanix, CloudStack, and vSphere clusters commonly have their names prefixed with the cluster name.
For Bare metal clusters, power off and delete the target machines.
Clusters with a management cluster
When you use a management cluster, a separate cluster monitors the state of your workload cluster. If a machine that's part of the workload cluster is powered off or terminated, then the management cluster detects a health issue. The management cluster then provisions a new VM to return the workload cluster to the desired state.
Therefore, to clean up clusters with separate management clusters, delete the custom resources (CRs) that represent the workload cluster from the management cluster. This action deletes all the VMs for that workload cluster.
Note: In the following commands, replace WORKLOAD_CLUSTER_NAME with the workload cluster that you want to delete.
Replace MANAGEMENT_CLUSTER_FOLDER with the folder that EKS Anywhere created for the management cluster.
Replace MANAGEMENT_CLUSTER_KUBECONFIG_FILE with the kubeconfig file that the service generated for the management cluster. The kubeconfig file is in the MANAGEMENT_CLUSTER_FOLDER.
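Before you run the delete commands, it can help to confirm that the kubeconfig file exists at the expected path. The following is a minimal sketch under one assumption: that the file uses the name CLUSTER_NAME-eks-a-cluster.kubeconfig inside the cluster folder. If your file has a different name, then adjust the path. The function name mgmt_kubeconfig_path is hypothetical.

```shell
# Hypothetical helper: print the kubeconfig path for a management cluster
# folder, or return 1 if the file is missing.
# Assumption: the file is named <folder name>-eks-a-cluster.kubeconfig.
mgmt_kubeconfig_path() {
  local folder="$1"
  local name path
  name="$(basename "$folder")"
  path="${folder}/${name}-eks-a-cluster.kubeconfig"
  if [ -f "$path" ]; then
    printf '%s\n' "$path"
  else
    echo "kubeconfig not found: $path" >&2
    return 1
  fi
}

# Example:
#   KUBECONFIG_PATH="$(mgmt_kubeconfig_path MANAGEMENT_CLUSTER_FOLDER)" || exit 1
```

If the check fails, then kubectl commands that use that path can't reach the management cluster.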
Delete the Cluster API resource for the workload cluster:
kubectl delete clusters.cluster.x-k8s.io -n eksa-system WORKLOAD_CLUSTER_NAME --kubeconfig MANAGEMENT_CLUSTER_FOLDER/MANAGEMENT_CLUSTER_KUBECONFIG_FILE
Delete the clusters.anywhere.eks.amazonaws.com resource for the cluster, if it exists:
kubectl delete clusters.anywhere.eks.amazonaws.com WORKLOAD_CLUSTER_NAME --kubeconfig MANAGEMENT_CLUSTER_FOLDER/MANAGEMENT_CLUSTER_KUBECONFIG_FILE
Note: If the cluster creation failed before the clusters.anywhere.eks.amazonaws.com resource was provisioned, then you get the following error:
Error from server (NotFound): clusters.anywhere.eks.amazonaws.com "WORKLOAD_CLUSTER_NAME" not found