
How do I troubleshoot failures when I create an ElastiCache cluster?

When I create an Amazon ElastiCache cluster, the creation fails.

Short description

Based on whether the cluster is a self-designed or an Amazon ElastiCache Serverless cluster, cluster creation might fail for the following reasons:

  • You restore a backup from Amazon Simple Storage Service (Amazon S3), but the restore fails with an error.
  • There isn't enough capacity for the requested cache node type in an Availability Zone or AWS Region.
  • You selected a cache node type that's not supported in a specific Availability Zone of the Region.
  • There aren't enough free IP addresses in the subnet that you used for cache cluster creation.
  • ElastiCache can't access the AWS Key Management Service (AWS KMS) customer managed key that you used to encrypt a replication group.
  • Your cache doesn't have permission to create a virtual private cloud (VPC) endpoint for ElastiCache Serverless.
  • The AWS Identity and Access Management (IAM) user or role doesn't have the correct permissions.
  • Your AWS account requires a higher service quota.
  • You misconfigured the engine-specific parameters during cluster creation.

Resolution

Note: If you receive errors when you run AWS Command Line Interface (AWS CLI) commands, then see Troubleshooting errors for the AWS CLI. Also, make sure that you're using the most recent AWS CLI version.

Your Amazon S3 backup restoration fails with an error

The restoration of a backup from Amazon S3 can fail for several reasons. For example, ElastiCache can't retrieve the file, or the bucket is in another Region. To troubleshoot this issue, see How do I troubleshoot the "Create-failed" or "Permission denied" error that occurs when I try to restore my ElastiCache cluster from S3?

There isn't enough capacity for the requested cache node type in an Availability Zone or Region

If AWS doesn't have enough available on-demand capacity, then you might get the following error message when you create a cluster:

"Failed to create cache node because requested AZ does not have sufficient capacity. Please try again with another AZ."

The capacity in an Availability Zone or Region constantly changes. To resolve this issue, create the cluster at a different time or, as the error message suggests, in a different Availability Zone. For more information, see Error Messages: InsufficientCacheClusterCapacity.

An Availability Zone of the Region doesn't support the cache node type

Some Availability Zones in a Region don't support specific cache node types. If you create a cluster and select one of these Availability Zones, then you get the following error message:

"Cache node type is not currently supported in the AZ. Retry the launch with no availability zone or different AZs"

To check the availability of a specific cache node type, use the Amazon Elastic Compute Cloud (Amazon EC2) describe-instance-type-offerings AWS CLI command:

aws ec2 describe-instance-type-offerings --location-type availability-zone --filters Name=instance-type,Values=m5.large --region example-region --output table

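Note: Replace example-region with your Region, and replace m5.large with the instance type that corresponds to your cache node type (for example, m5.large for cache.m5.large).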
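
The same check can be scripted. The following is a minimal sketch, assuming that boto3 is installed and AWS credentials are configured. The function names are hypothetical. The prefix rule is also an assumption: it treats the part after "cache." as the EC2 instance type, as the command above does for m5.large.

```python
def ec2_type_for(cache_node_type):
    """Map a cache node type such as 'cache.m5.large' to 'm5.large'.

    Assumption: the name after the 'cache.' prefix is the EC2 instance type.
    """
    prefix = "cache."
    if cache_node_type.startswith(prefix):
        return cache_node_type[len(prefix):]
    return cache_node_type


def zones_offering(offerings):
    """Return the sorted, de-duplicated Availability Zones found in a list
    of InstanceTypeOfferings entries."""
    return sorted({offering["Location"] for offering in offerings})


def list_supported_zones(cache_node_type, region):
    """Call EC2 DescribeInstanceTypeOfferings (needs boto3 and credentials)."""
    import boto3  # imported here so the helpers above work without boto3

    ec2 = boto3.client("ec2", region_name=region)
    paginator = ec2.get_paginator("describe_instance_type_offerings")
    offerings = []
    for page in paginator.paginate(
        LocationType="availability-zone",
        Filters=[{"Name": "instance-type",
                  "Values": [ec2_type_for(cache_node_type)]}],
    ):
        offerings.extend(page["InstanceTypeOfferings"])
    return zones_offering(offerings)
```

For example, list_supported_zones("cache.m5.large", "example-region") returns the Availability Zones that offer the matching instance type. An empty list suggests that the type isn't offered in that Region.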

There aren't enough free IP addresses in the subnet you chose

If the subnets that you use for your ElastiCache cluster don't have enough free IP addresses, then you might get the following error message:

"Failed to create Cache Cluster due to insufficient Elastic Network Interface or free IP address"

To resolve this issue, identify the subnet group that you configured for the ElastiCache cluster. Then, check each subnet to verify that there are enough free IP addresses. For more information, see How do I troubleshoot insufficient IP address errors that occur during scaling activity in my Amazon VPC?

To free up IP addresses, you can also delete unused elastic network interfaces in a subnet. Or, add subnets to the subnet group in the required Availability Zone to allocate more free IP addresses.
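
To see which subnets in a subnet group are running low, you can compare each subnet's free address count against the number of nodes that you plan to create. The following is a rough sketch, assuming that boto3 is installed and credentials are configured. The function names and the needed threshold are hypothetical.

```python
def subnets_low_on_ips(subnets, needed):
    """Return (subnet ID, Availability Zone, free IP count) for every subnet
    with fewer than `needed` free IP addresses."""
    return [
        (s["SubnetId"], s["AvailabilityZone"], s["AvailableIpAddressCount"])
        for s in subnets
        if s["AvailableIpAddressCount"] < needed
    ]


def check_subnet_group(subnet_group_name, needed, region):
    """Look up the subnets of a cache subnet group and report the ones that
    are short on free IP addresses (needs boto3 and credentials)."""
    import boto3  # imported here so the helper above works without boto3

    elasticache = boto3.client("elasticache", region_name=region)
    ec2 = boto3.client("ec2", region_name=region)

    group = elasticache.describe_cache_subnet_groups(
        CacheSubnetGroupName=subnet_group_name
    )["CacheSubnetGroups"][0]
    subnet_ids = [s["SubnetIdentifier"] for s in group["Subnets"]]

    subnets = ec2.describe_subnets(SubnetIds=subnet_ids)["Subnets"]
    return subnets_low_on_ips(subnets, needed)
```

An empty result means that every subnet in the group has at least the requested number of free addresses at the time of the call.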

ElastiCache can't access the AWS KMS customer managed key

With ElastiCache at-rest encryption, you can use the default service-managed encryption at rest or your own symmetric customer managed AWS KMS key. If the KMS key was deleted or turned off, or its grants were revoked, when you created your cluster, then you might get the following error message:

"Failed to create instance test-cluster due to error accessing AWS Key Management Service (KMS) for Customer Master Key arn:aws:kms:us-east-1:123456:key/1abcd2"

It's not a best practice to delete, turn off, or revoke grants for the AWS KMS key that you used to encrypt a replication group. AWS KMS deletes root keys only after a waiting period of at least 7 days. During the waiting period, you can cancel the scheduled deletion. If you delete the AWS KMS key, then you can't recover the cache.
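
Before you retry, you can confirm the state of the key. The following is a minimal sketch, assuming that boto3 is installed and that the caller is allowed to call kms:DescribeKey. The function names are hypothetical. The check covers only the key state (turned off or pending deletion); it doesn't inspect grants.

```python
def key_problem(key_metadata):
    """Return a short description of why a KMS key is unusable, or None if
    the key state is 'Enabled'."""
    state = key_metadata.get("KeyState")
    if state == "Enabled":
        return None
    if state == "PendingDeletion":
        # A scheduled deletion can still be canceled during the waiting period.
        return f"key is scheduled for deletion on {key_metadata.get('DeletionDate')}"
    return f"key state is {state}"


def check_key(key_id, region):
    """Fetch the key metadata with KMS DescribeKey and evaluate it
    (needs boto3 and credentials)."""
    import boto3  # imported here so the helper above works without boto3

    kms = boto3.client("kms", region_name=region)
    return key_problem(kms.describe_key(KeyId=key_id)["KeyMetadata"])
```

If check_key returns None but cluster creation still fails, then the key state is fine and the cause is more likely related to grants or the key policy.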

Your ElastiCache Serverless cache doesn't have permission to create a VPC endpoint

When you create a new ElastiCache Serverless cluster, ElastiCache creates VPC endpoints in the selected subnets of your VPC. Your applications use the VPC endpoints to connect to the cache. If your cache can't create the VPC endpoints, then you might have permission issues. To troubleshoot permission issues, see How do I resolve ElastiCache Serverless cluster creation issues?

Your IAM user or role doesn't have the correct permissions

If your IAM user or role doesn't have the correct permissions, then you might get the following error message:

"An error occurred (AccessDenied) when calling the CreateReplicationGroup operation: User: arn:aws:sts::xxxxxxxxx:assumed-role/Hello123 is not authorized to perform: elasticache:CreateReplicationGroup on resource: arn:aws:elasticache:ap-southeast-2:xxxxxxxxxx:replicationgroup:ROLEA because no identity-based policy allows the elasticache:CreateReplicationGroup action".

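To check which actions the identity is missing, one option is the IAM policy simulator. The following is a sketch under stated assumptions: boto3 is installed, the caller can call iam:SimulatePrincipalPolicy, and the function names are hypothetical. The simulator expects the ARN of an IAM user, group, or role, not the assumed-role session ARN shown in the error message, and its result is an approximation of the real authorization decision.

```python
def denied_actions(evaluation_results):
    """Return the action names whose simulated decision is not 'allowed'
    (that is, 'implicitDeny' or 'explicitDeny')."""
    return [
        result["EvalActionName"]
        for result in evaluation_results
        if result["EvalDecision"] != "allowed"
    ]


def simulate(principal_arn, actions):
    """Run the IAM policy simulator for a user or role ARN
    (needs boto3 and credentials)."""
    import boto3  # imported here so the helper above works without boto3

    iam = boto3.client("iam")
    response = iam.simulate_principal_policy(
        PolicySourceArn=principal_arn,
        ActionNames=actions,
    )
    return denied_actions(response["EvaluationResults"])
```

For example, pass ["elasticache:CreateReplicationGroup"] as actions. A non-empty result lists the actions that no policy attached to the principal allows.
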
When you use a custom-managed IAM policy with ElastiCache, take one of the following actions:

Your AWS account requires a higher service quota

Your account has default quotas for each AWS service, and each quota is specific to your Region.

If your account doesn't have the required service quota, then you might get one of the following error messages:

  • "Cache subnet group quota exceeded. You can have at most 500 cache subnet groups in this region. If you need more, please visit the Support Center and open a Service Limit Increase case."
  • "Customer node quota exceeded. You can have at most 1250 nodes in this region. If you need more, please visit the Support Center and open a Service Limit Increase case."

You can request increases for some quotas. For information about quotas and how to increase your quotas, see Quotas for ElastiCache.
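
You can also read the current quota values programmatically. The following is a minimal sketch, assuming that boto3 is installed and that "elasticache" is the Service Quotas service code for ElastiCache (verify the code with the ListServices operation). The function names are hypothetical.

```python
def find_quotas(quotas, keyword):
    """Return (name, value, adjustable) for quotas whose name contains
    `keyword`, ignoring case."""
    return [
        (q["QuotaName"], q["Value"], q["Adjustable"])
        for q in quotas
        if keyword.lower() in q["QuotaName"].lower()
    ]


def list_elasticache_quotas(region):
    """List the applied ElastiCache quotas for a Region
    (needs boto3 and credentials)."""
    import boto3  # imported here so the helper above works without boto3

    client = boto3.client("service-quotas", region_name=region)
    paginator = client.get_paginator("list_service_quotas")
    quotas = []
    # Assumption: "elasticache" is the service code for ElastiCache.
    for page in paginator.paginate(ServiceCode="elasticache"):
        quotas.extend(page["Quotas"])
    return quotas
```

For example, find_quotas(list_elasticache_quotas("example-region"), "nodes") narrows the list to node-related quotas and shows whether each one is adjustable.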

You misconfigured your engine-specific parameters

If you didn't specify a parameter group that matches your engine version, then you might get the following error message:

"An error occurred (InvalidParameterCombination) when calling the CreateReplicationGroup operation: Expected a parameter group of family redis7 but found one of family redis6.x. User has to verify that Parameter Group used has engine version that matches the cluster that is created."

When you create an ElastiCache cluster, make sure that the parameter group matches the engine version and cluster mode type.
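
To catch a family mismatch before you create the cluster, you can compare the parameter group's family with the family that the engine version expects. The following is a rough sketch, assuming that boto3 is installed and credentials are configured. The function names are hypothetical. The sketch compares only the parameter group family and doesn't check cluster mode settings.

```python
def check_family(parameter_group, engine_version_info):
    """Return None when the families match, otherwise a message that
    describes the mismatch."""
    have = parameter_group["CacheParameterGroupFamily"]
    want = engine_version_info["CacheParameterGroupFamily"]
    if have == want:
        return None
    name = parameter_group["CacheParameterGroupName"]
    return f"expected family {want} but parameter group {name} is {have}"


def check_parameter_group(group_name, engine, engine_version, region):
    """Compare a parameter group with an engine version
    (needs boto3 and credentials)."""
    import boto3  # imported here so the helper above works without boto3

    elasticache = boto3.client("elasticache", region_name=region)
    group = elasticache.describe_cache_parameter_groups(
        CacheParameterGroupName=group_name
    )["CacheParameterGroups"][0]
    versions = elasticache.describe_cache_engine_versions(
        Engine=engine, EngineVersion=engine_version
    )["CacheEngineVersions"]
    if not versions:
        raise ValueError(f"no engine version found for {engine} {engine_version}")
    return check_family(group, versions[0])
```

A return value of None means that the families agree. Any other result names the family that the engine version expects.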

Related information

How ElastiCache works

AWS OFFICIAL, updated a year ago
2 Comments

If you are facing failures when creating an Amazon ElastiCache cluster, start by checking your IAM permissions to ensure your user or role has the required access, such as elasticache:CreateCacheCluster. Next, verify your VPC and subnet settings to confirm that there are available IP addresses and that security group rules allow traffic on the Redis (TCP 6379) or Memcached (TCP 11211) ports. Also, ensure you select a supported node type that is available in your AWS Region. If you encounter errors like "InsufficientCacheClusterCapacity", the Availability Zone doesn't currently have enough capacity for the requested node type, so retry later or in a different Availability Zone. Additionally, review the ElastiCache events in the AWS console to identify specific failure reasons. Some failures may be due to reaching maximum node limits in your Region; in that case, check your Service Quotas and request an increase. Addressing these factors will help you troubleshoot and successfully create your ElastiCache cluster.

replied a year ago

Troubleshooting failures when creating an Amazon ElastiCache cluster requires checking various factors that may cause deployment issues. Below are key areas to investigate:

  1. Check IAM Permissions: Ensure that your IAM user or role has the necessary permissions to create an ElastiCache cluster. Policies should include actions like elasticache:CreateCacheCluster, elasticache:DescribeCacheClusters, and elasticache:AuthorizeCacheSecurityGroupIngress.

  2. Verify VPC and Subnet Settings: If your cluster is in a VPC, confirm that:

    • The selected subnets exist and are correctly configured.
    • The security group allows inbound and outbound traffic for the necessary ports (e.g., port 6379 for Redis and port 11211 for Memcached).
    • The network ACLs and route tables allow communication within the VPC.

  3. Check Node Type and Availability Zone (AZ) Compatibility:

    • Some node types may not be available in certain AWS Regions.
    • Ensure the chosen Availability Zones support the selected node type.

  4. Verify Parameter Group and Subnet Group:

    • The parameter group should match the engine version (Redis or Memcached).
    • The subnet group should include valid subnets for the selected VPC.

  5. Examine Quotas and Limits: Check if you have exceeded your AWS account limits, such as:

    • Maximum number of clusters per Region.
    • Maximum nodes per cluster.

  6. Review AWS CloudTrail and CloudWatch Logs:

    • CloudTrail can show permission-related failures.
    • CloudWatch Logs can provide error messages regarding cluster creation.

replied a year ago