How do I troubleshoot "Error occurred during operation 'DeleteClusters SDK Error: The Cluster cannot be deleted while Services are active." when deleting an ECS cluster from CloudFormation?

4 minute read
Content level: Intermediate

This article explains how to resolve the DeleteClusters SDK error that occurs when you attempt to delete an ECS cluster with active services through CloudFormation. The error appears when ECS services still exist within the cluster and must be deleted before the cluster itself can be deleted.

Description:

This error occurs when you attempt to delete an Amazon Elastic Container Service (Amazon ECS) cluster that still has active ECS services in it. Amazon ECS does not allow a cluster to be deleted while it has active services, because doing so would disrupt those services and the tasks they run. To delete an ECS cluster successfully, you must first delete all services within the cluster.
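
To confirm that active services are what blocks the deletion, you can check the service count that ECS reports for the cluster. The following is a minimal sketch, assuming a boto3 ECS client is passed in. The function name `count_active_services` and the cluster name `MyCluster` are illustrative and do not come from the article.

```python
def count_active_services(ecs_client, cluster_name):
    """Return the number of active services that ECS reports for a cluster.

    `ecs_client` is assumed to be a boto3 ECS client, e.g. boto3.client("ecs").
    """
    response = ecs_client.describe_clusters(clusters=[cluster_name])
    clusters = response.get("clusters", [])
    if not clusters:
        # An unknown cluster is reported under "failures", not "clusters".
        raise LookupError(f"Cluster not found: {cluster_name}")
    return clusters[0].get("activeServicesCount", 0)


# Example usage (needs boto3 and AWS credentials; "MyCluster" is a placeholder):
#   import boto3
#   print(count_active_services(boto3.client("ecs"), "MyCluster"))
```

A result greater than zero means the cluster deletion is expected to fail with this error until those services are deleted.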

Resolution:

This error can occur in two scenarios. In the first, both AWS::ECS::Cluster and AWS::ECS::Service are created via CloudFormation without any dependency between the resources. In the second, the AWS::ECS::Cluster resource is created via CloudFormation and the ECS services are created outside of CloudFormation.

Troubleshooting steps for each scenario based on the configuration:

  1. Scenario 1: Both AWS::ECS::Cluster and AWS::ECS::Service are created via CloudFormation.
  2. Scenario 2: AWS::ECS::Cluster resource is created via CloudFormation and the ECS service(s) are created outside of CloudFormation.

Scenario 1:

If the ECS service(s) are defined in the same CloudFormation template as the cluster, this error should not occur as long as each service depends on the cluster, either explicitly through a DependsOn attribute or implicitly through the Ref intrinsic function. For example:

Resources:
  MyECSCluster:
    Type: AWS::ECS::Cluster
    Properties:
      ClusterName: MyCluster

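  MyECSService:
    Type: AWS::ECS::Service
    # Explicit dependency: the service is created after, and deleted before, the cluster.
    DependsOn: MyECSCluster
    Properties:
      ServiceName: MyService
      # !Ref also creates an implicit dependency on the cluster.
      Cluster: !Ref MyECSCluster
      # Other service properties (for example, TaskDefinition) are omitted for brevity.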

If the deletion order is undefined, CloudFormation may try to delete the cluster before it deletes the services. If you encounter this error, try deleting the stack again. By the second attempt, the ECS services and other resources in the stack may already have been deleted during the first attempt, which allows the cluster deletion to succeed.

To prevent this issue in future deployments, always ensure proper dependencies are defined in your CloudFormation templates between the ECS cluster and its services, such that CloudFormation first deletes the service(s) and then deletes the cluster.
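
One way to catch a missing dependency before deployment is to scan the template for services that are not tied to the cluster. The sketch below is illustrative only. It assumes the template has already been parsed into a dictionary with intrinsic functions in long form (for example `{"Ref": "MyECSCluster"}`), and it checks only a direct Ref, Fn::GetAtt, or DependsOn, not indirect references such as Fn::Sub. The function name `services_missing_cluster_dependency` is hypothetical.

```python
def services_missing_cluster_dependency(template):
    """Return logical IDs of AWS::ECS::Service resources that neither reference
    nor explicitly depend on an AWS::ECS::Cluster defined in the same template."""
    resources = template.get("Resources", {})
    cluster_ids = {
        logical_id
        for logical_id, resource in resources.items()
        if resource.get("Type") == "AWS::ECS::Cluster"
    }
    if not cluster_ids:
        # No cluster in this template, so there is nothing to order against.
        return []

    missing = []
    for logical_id, resource in resources.items():
        if resource.get("Type") != "AWS::ECS::Service":
            continue
        depends_on = resource.get("DependsOn", [])
        if isinstance(depends_on, str):
            depends_on = [depends_on]
        cluster_prop = resource.get("Properties", {}).get("Cluster")
        referenced = None
        if isinstance(cluster_prop, dict):
            # Ref and Fn::GetAtt both create an implicit dependency.
            if "Ref" in cluster_prop:
                referenced = cluster_prop["Ref"]
            elif "Fn::GetAtt" in cluster_prop:
                target = cluster_prop["Fn::GetAtt"]
                referenced = target.split(".")[0] if isinstance(target, str) else target[0]
        if referenced in cluster_ids or cluster_ids & set(depends_on):
            continue
        missing.append(logical_id)
    return missing
```

A service that sets Cluster to a hard-coded cluster name and has no DependsOn attribute would be reported, because CloudFormation has no way to infer the deletion order in that case.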

Scenario 2:

If the services were created outside of the CloudFormation stack that created the ECS cluster, deleting the stack causes the AWS::ECS::Cluster resource to fail with this error, and the stack ends up in DELETE_FAILED status.

To fix this, you can retry the stack deletion while retaining the AWS::ECS::Cluster resource, and then clean up the ECS cluster and its services manually.
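
The retry with a retained resource can be sketched as follows, assuming a boto3 CloudFormation client is passed in. The function name `delete_stack_retaining`, the stack name, and the logical ID are placeholders. CloudFormation accepts the RetainResources parameter only for stacks that are already in DELETE_FAILED status, so the sketch checks the status first.

```python
def delete_stack_retaining(cfn_client, stack_name, logical_ids_to_retain):
    """Retry deletion of a DELETE_FAILED stack, skipping the given resources.

    `cfn_client` is assumed to be boto3.client("cloudformation").
    """
    stacks = cfn_client.describe_stacks(StackName=stack_name)["Stacks"]
    status = stacks[0]["StackStatus"]
    if status != "DELETE_FAILED":
        # RetainResources is valid only for stacks in DELETE_FAILED status.
        raise RuntimeError(f"Stack is in {status}, expected DELETE_FAILED")

    cfn_client.delete_stack(
        StackName=stack_name,
        RetainResources=list(logical_ids_to_retain),
    )
    # Block until the deletion finishes; the waiter raises an error on failure.
    cfn_client.get_waiter("stack_delete_complete").wait(StackName=stack_name)


# Example usage (needs boto3 and AWS credentials; names are placeholders):
#   import boto3
#   delete_stack_retaining(boto3.client("cloudformation"), "my-stack", ["MyECSCluster"])
```

The retained cluster is no longer managed by CloudFormation after this call, so it and its services still have to be removed separately.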

Alternatively, you can first delete the ECS service(s) from the Amazon ECS console and then retry the stack deletion, so that the AWS::ECS::Cluster resource is deleted successfully. To do this, follow these steps:

  1. Open the Amazon ECS console.
  2. In the navigation pane, choose Clusters.
  3. Select the cluster that is causing the dependency violation.
  4. On the Services tab, identify the active service(s) associated with the cluster. Either stop the running tasks first and then delete the service(s), or select the Force delete option when deleting the service(s) so that ECS removes the tasks and the service(s) together.
  5. Once all services are deleted and no longer active, you can go back to the CloudFormation console.
  6. In the CloudFormation console, select the stack and initiate the DELETE stack operation again.

This should allow the CloudFormation stack to delete the ECS cluster resource successfully, because the services that blocked the deletion have been removed.
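
The console steps above can also be scripted. The following is a rough sketch, assuming a boto3 ECS client is passed in. The function name `force_delete_services` is hypothetical. It force-deletes every service that `list_services` returns for the cluster, so use it only when nothing in the cluster needs to be kept.

```python
def force_delete_services(ecs_client, cluster_name):
    """Force-delete every service in the cluster and return the deleted ARNs.

    `ecs_client` is assumed to be a boto3 ECS client, e.g. boto3.client("ecs").
    """
    service_arns = []
    paginator = ecs_client.get_paginator("list_services")
    for page in paginator.paginate(cluster=cluster_name):
        service_arns.extend(page.get("serviceArns", []))

    for arn in service_arns:
        # force=True lets ECS delete a service that still has running tasks.
        ecs_client.delete_service(cluster=cluster_name, service=arn, force=True)

    # The waiter polls describe_services, which accepts at most 10 services per call.
    waiter = ecs_client.get_waiter("services_inactive")
    for start in range(0, len(service_arns), 10):
        waiter.wait(cluster=cluster_name, services=service_arns[start:start + 10])
    return service_arns


# Example usage (needs boto3 and AWS credentials; names are placeholders):
#   import boto3
#   force_delete_services(boto3.client("ecs"), "MyCluster")
#   boto3.client("cloudformation").delete_stack(StackName="my-stack")
```

Waiting for the services to become inactive before retrying the stack deletion avoids hitting the same error while the services are still draining.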

In both scenarios, the key is to ensure that all services are deleted before the ECS cluster is deleted. The main difference is where you manage the services (within CloudFormation or in the ECS console), which depends on how they were originally created.


Co-Author: Kirtan Gajjar