This article helps you understand and resolve the SageMaker AI endpoint deployment error that occurs when the requested instance type isn’t available in enough Availability Zones that overlap with your configured subnets.
Short description
When deploying SageMaker AI endpoints, you might encounter the error "Unable to locate at least 2 availability zone(s) with the requested instance type that overlap with SageMaker subnets". This article explains why this occurs and provides detailed steps to diagnose and resolve the issue.
Overview
Amazon SageMaker AI enforces high availability by deploying resources across at least two Availability Zones (AZs). You might see this error in two common scenarios:
- When the VPC configuration used for the deployment doesn't include subnets in at least two AZs
- When the requested instance type isn't available in enough AZs that overlap with your configured subnets
This can affect any instance type, though it's more common with high-demand instance types such as GPU-based instances (for example, the ml.g5 and ml.g6e families) and specialized compute instances.
Root Causes
The error occurs due to one or more of the following reasons:
1. Insufficient subnet and AZ coverage: SageMaker requires at least two subnets in different AZs for high availability, even when deploying a single instance. If your SageMaker model's VPC configuration has fewer than two subnets in distinct AZs, attach additional subnets until it does. It's recommended to create subnets in all the available AZs for your Region.
2. Unsupported instance type: The instance type isn’t available in all your selected AZs (for example, ml.g6e.12xlarge might be supported only in eu-central-1a and eu-central-1c).
3. Temporary capacity constraints: Even if an instance type is supported in an AZ, temporary capacity shortages can cause this error, especially for large GPU families (ml.g5, ml.g6e, ml.p4d, etc.).
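4. Subnets map to the same physical AZ: AZ names (for example, eu-central-1a and eu-central-1b) are mapped to physical zones separately for each AWS account, so a name alone doesn't reliably identify a zone. When subnets come from more than one account (for example, shared subnets), two different names might still map to the same underlying AZ ID (for example, both euc1-az1).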
Resolution
Step 1: Verify Instance Type Availability
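Use the AWS CLI to check which AZs offer the instance type. Because this is an EC2 command, pass the instance type without the ml. prefix (for example, g6e.xlarge for ml.g6e.xlarge, as in the example output):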
aws ec2 describe-instance-type-offerings --location-type availability-zone --filters Name=instance-type,Values=<instance-type> --region <region> --output table
Example output:
----------------------------------------------------------
|             DescribeInstanceTypeOfferings              |
+--------------------------------------------------------+
||                InstanceTypeOfferings                 ||
|+---------------+-----------------+--------------------+|
|| InstanceType  |    Location     |    LocationType    ||
|+---------------+-----------------+--------------------+|
||  g6e.xlarge   |  eu-central-1a  | availability-zone  ||
||  g6e.xlarge   |  eu-central-1c  | availability-zone  ||
+--------------------------------------------------------+
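The same check can be scripted. The following is a minimal sketch rather than an official tool: the helper names `to_ec2_instance_type`, `zones_offering`, and `fetch_offerings` are hypothetical, and it assumes that the EC2 name is the SageMaker name without the `ml.` prefix, as in the example output above. The first two helpers only parse a response in the shape returned by describe-instance-type-offerings, so they can be tried without AWS credentials.

```python
def to_ec2_instance_type(sagemaker_type: str) -> str:
    """Drop the SageMaker 'ml.' prefix (assumed mapping: ml.g6e.xlarge -> g6e.xlarge)."""
    return sagemaker_type[3:] if sagemaker_type.startswith("ml.") else sagemaker_type


def zones_offering(response: dict, instance_type: str) -> list:
    """Return the sorted, de-duplicated locations that offer instance_type."""
    return sorted({
        offering["Location"]
        for offering in response.get("InstanceTypeOfferings", [])
        if offering["InstanceType"] == instance_type
    })


def fetch_offerings(instance_type: str, region: str) -> dict:
    """Query EC2 for offerings (requires boto3 and AWS credentials)."""
    import boto3  # imported lazily so the pure helpers above work without boto3

    ec2 = boto3.client("ec2", region_name=region)
    paginator = ec2.get_paginator("describe_instance_type_offerings")
    offerings = []
    for page in paginator.paginate(
        LocationType="availability-zone",
        Filters=[{"Name": "instance-type", "Values": [instance_type]}],
    ):
        offerings.extend(page["InstanceTypeOfferings"])
    return {"InstanceTypeOfferings": offerings}


# Sample data copied from the example output; no AWS call is made here.
sample = {"InstanceTypeOfferings": [
    {"InstanceType": "g6e.xlarge", "Location": "eu-central-1a",
     "LocationType": "availability-zone"},
    {"InstanceType": "g6e.xlarge", "Location": "eu-central-1c",
     "LocationType": "availability-zone"},
]}
print(zones_offering(sample, to_ec2_instance_type("ml.g6e.xlarge")))
```

If the printed list contains fewer than two zones, the instance type can't satisfy the multi-AZ requirement in that Region, whatever subnets you configure.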
Step 2: Confirm Your Subnet Placement
Run this command to confirm which physical AZ IDs your subnets map to (replace the example subnet IDs with your own):
aws ec2 describe-subnets \
--subnet-ids subnet-abcd subnet-efgh \
--query 'Subnets[].{SubnetId:SubnetId,AZ:AvailabilityZone,AZId:AvailabilityZoneId}'
If two subnets have the same AvailabilityZoneId, they map to the same physical zone even if their names differ. Choose or create subnets in different AZ IDs to meet SageMaker’s multi-AZ requirement.
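To combine Steps 1 and 2, you can compare AZ IDs on both sides. The sketch below is illustrative only: `usable_zone_ids` and `meets_multi_az` are hypothetical helper names, the subnet dictionaries follow the shape of the describe-subnets query above, and the set of offered AZ IDs is assumed to come from running describe-instance-type-offerings with `--location-type availability-zone-id` instead of `availability-zone`.

```python
def usable_zone_ids(subnets: list, offered_zone_ids: set) -> dict:
    """Map each offered AZ ID to the configured subnets located in it."""
    by_zone = {}
    for subnet in subnets:
        zone_id = subnet["AvailabilityZoneId"]
        if zone_id in offered_zone_ids:
            by_zone.setdefault(zone_id, []).append(subnet["SubnetId"])
    return by_zone


def meets_multi_az(subnets: list, offered_zone_ids: set, minimum: int = 2) -> bool:
    """True if the subnets cover at least `minimum` distinct offered AZ IDs."""
    return len(usable_zone_ids(subnets, offered_zone_ids)) >= minimum


# Sample data (placeholder IDs): both subnets sit in the same physical zone.
same_zone = [
    {"SubnetId": "subnet-abcd", "AvailabilityZoneId": "euc1-az1"},
    {"SubnetId": "subnet-efgh", "AvailabilityZoneId": "euc1-az1"},
]
offered = {"euc1-az1", "euc1-az2"}
print(usable_zone_ids(same_zone, offered))
print(meets_multi_az(same_zone, offered))
```

With the sample data, both subnets land in euc1-az1, so the check fails even though two subnets are configured.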
Step 3: Configure Multiple Subnets
- Ensure your VPC has subnets in at least two different AZs.
- Best practice is to create subnets in all the available AZs within your region.
- Update your SageMaker AI model’s VPC configuration to use these subnets.
Example configuration (the three subnets are in eu-central-1a, eu-central-1c, and eu-central-1b, respectively):
"VpcConfig": {
    "Subnets": [
        "subnet-0abc123",
        "subnet-0def456",
        "subnet-0ghi789"
    ],
    "SecurityGroupIds": ["sg-0123456789abcdef"]
}
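If you create the model with boto3, the same structure is passed as the `VpcConfig` parameter of `create_model`. The sketch below is one possible way to do that, not a prescribed method: `build_vpc_config` and `create_model_in_vpc` are hypothetical helpers, and all IDs are placeholders. The validation only counts unique subnet IDs; it doesn't confirm that the subnets are in different AZs, which still needs the check from Step 2.

```python
def build_vpc_config(subnet_ids, security_group_ids) -> dict:
    """Return a VpcConfig dict; reject fewer than two unique subnets."""
    unique_subnets = list(dict.fromkeys(subnet_ids))  # de-duplicate, keep order
    if len(unique_subnets) < 2:
        raise ValueError("Provide at least two subnets, each in a different AZ")
    if not security_group_ids:
        raise ValueError("Provide at least one security group ID")
    return {
        "Subnets": unique_subnets,
        "SecurityGroupIds": list(security_group_ids),
    }


def create_model_in_vpc(model_name, image_uri, model_data_url, role_arn,
                        vpc_config, region):
    """Create a SageMaker model in the given VPC (needs boto3 and credentials)."""
    import boto3  # imported lazily so build_vpc_config works without boto3

    sagemaker = boto3.client("sagemaker", region_name=region)
    return sagemaker.create_model(
        ModelName=model_name,
        PrimaryContainer={"Image": image_uri, "ModelDataUrl": model_data_url},
        ExecutionRoleArn=role_arn,
        VpcConfig=vpc_config,
    )


# Placeholder IDs taken from the example configuration.
vpc_config = build_vpc_config(
    ["subnet-0abc123", "subnet-0def456", "subnet-0ghi789"],
    ["sg-0123456789abcdef"],
)
print(vpc_config)
```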
Step 4: Verify Resource Quotas
Check your service quotas for the required instance type and request quota increases if needed. Ensure quotas are approved before deployment.
See Requesting a quota increase.
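Quotas can also be read programmatically through the Service Quotas API. The following is a rough sketch under two assumptions that you should verify in your own account: that the service code is `sagemaker`, and that endpoint quotas are named in the form "<instance type> for endpoint usage". The helper names `endpoint_quota` and `fetch_sagemaker_quotas` are hypothetical.

```python
def endpoint_quota(quotas: list, instance_type: str):
    """Return the quota value for '<instance_type> for endpoint usage', or None.

    The quota naming pattern is an assumption; check it against your account.
    """
    wanted = f"{instance_type} for endpoint usage".lower()
    for quota in quotas:
        if quota.get("QuotaName", "").lower() == wanted:
            return quota.get("Value")
    return None


def fetch_sagemaker_quotas(region: str) -> list:
    """List SageMaker quotas (needs boto3 and AWS credentials)."""
    import boto3  # imported lazily so endpoint_quota works without boto3

    client = boto3.client("service-quotas", region_name=region)
    paginator = client.get_paginator("list_service_quotas")
    quotas = []
    for page in paginator.paginate(ServiceCode="sagemaker"):
        quotas.extend(page["Quotas"])
    return quotas


# Sample data with made-up values; no AWS call is made here.
sample_quotas = [
    {"QuotaName": "ml.g5.12xlarge for endpoint usage", "Value": 0.0},
    {"QuotaName": "ml.g6e.xlarge for endpoint usage", "Value": 2.0},
]
print(endpoint_quota(sample_quotas, "ml.g6e.xlarge"))
```

A value of 0 for the instance type means the deployment can't succeed until a quota increase is approved, regardless of AZ coverage.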
Additional Recommendations
If the issue persists after completing the above steps, try the following options:
1. Retry deployment later:
GPU-based instances may face transient capacity shortages. As instance availability can vary throughout the day, retry the deployment periodically to increase success probability.
2. Consider Alternative Instance Types:
If the instance type isn’t available in multiple AZs, choose a comparable alternative such as:
ml.g5.12xlarge → ml.g6.12xlarge
ml.g6.24xlarge → ml.g6e.24xlarge
See SageMaker supported instance types.
3. Consider other Regions:
Some Regions may have better capacity for large GPU instances. You can check other Regions using the same describe-instance-type-offerings command.
4. (Optional) Use On-Demand Capacity Reservations (ODCR):
For long-running production workloads, consider reserving capacity to guarantee instance availability. See On-Demand Capacity Reservations.
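For the retry recommendation, a simple exponential backoff loop is one way to space out attempts. This is a generic sketch, not SageMaker-specific code: `retry_with_backoff` is a hypothetical helper, `attempt` is assumed to be a callable of your own that creates the endpoint, waits for the result, and raises an exception if the deployment fails, and `is_capacity_error` is assumed to recognize the failure message. The delay values are arbitrary starting points.

```python
import time


def retry_with_backoff(attempt, is_capacity_error, max_attempts=5,
                       base_delay=60.0, sleep=time.sleep):
    """Call attempt() until it succeeds, doubling the wait after each failure.

    Errors that is_capacity_error() rejects, and the final failure, are re-raised.
    """
    for attempt_number in range(1, max_attempts + 1):
        try:
            return attempt()
        except Exception as exc:
            if attempt_number == max_attempts or not is_capacity_error(exc):
                raise
            # Wait base_delay, then 2x, 4x, ... before the next attempt.
            sleep(base_delay * 2 ** (attempt_number - 1))


def looks_like_az_error(exc: Exception) -> bool:
    """Match the error text quoted at the top of this article."""
    return "availability zone(s) with the requested instance type" in str(exc)
```

Passing `sleep` as a parameter keeps the loop easy to test without waiting, and re-raising unrelated errors avoids retrying failures that a later attempt can't fix, such as a missing subnet.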
Following these steps helps you achieve a reliable multi-AZ deployment and avoid this common SageMaker endpoint deployment failure.
Related Resources