Are there capacity issues for g4dn instances in eu-west-1?

Hi there,
I'm building an application that autoscales a group of g4dn.2xlarge instances as needed to act as GPU workers. Over the last week I've noticed that it can sometimes take over 20 minutes to bring up new instances, even if I'm only requesting 1 or 2. Whilst waiting, I get error messages of the following form:

Launching a new EC2 instance. Status Reason: We currently do not have sufficient g4dn.2xlarge capacity in the Availability Zone you requested (eu-west-1a). Our system will be working on provisioning additional capacity. You can currently get g4dn.2xlarge capacity by not specifying an Availability Zone in your request or choosing eu-west-1b, eu-west-1c. Launching EC2 instance failed.

I have not yet looked at making my application work across multiple Availability Zones, as it's still at an early stage and I assume the networking will be more complicated, but I can see that I need to start giving it some thought. Before I do that, though, is it possible to get some insight into what is causing these capacity issues? Is there a problem specific to eu-west-1a, or will I see similar issues if I try to launch in the other Availability Zones as well?

Thanks in advance.

Alan

abroun
asked 2 years ago, 1146 views
2 Answers
Hi abroun

If you get this error when you try to launch an instance or restart a stopped instance, AWS does not currently have enough available On-Demand capacity to fulfill your request.

Solution:
To resolve the issue, try the following:

  • Wait a few minutes and then submit your request again; capacity can shift frequently.

  • Submit a new request with a reduced number of instances. For example, if you're making a single request to launch 15 instances, try making 3 requests for 5 instances, or 15 requests for 1 instance instead.

  • If you're launching an instance, submit a new request without specifying an Availability Zone.

  • If you're launching an instance, submit a new request using a different instance type (which you can resize at a later stage). For more information, see [1].

  • Launching instances into a cluster placement group can itself produce an insufficient capacity error. For more information, see [2].
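The retry, smaller-request, and alternative-zone suggestions above could be combined along the following lines. This is only a minimal sketch: `InsufficientCapacity`, `split_request`, and `launch_with_fallback` are hypothetical names introduced for illustration, and `launch` is a function you would supply yourself (for example, a thin wrapper around your EC2 launch call that raises `InsufficientCapacity` when the API reports insufficient capacity).

```python
import time
from typing import Callable, List, Optional, Sequence


class InsufficientCapacity(Exception):
    """Hypothetical stand-in for an insufficient capacity error from EC2."""


def split_request(total: int, batch_size: int) -> List[int]:
    """Split one request for `total` instances into batches of at most `batch_size`."""
    if total < 0 or batch_size < 1:
        raise ValueError("total must be >= 0 and batch_size must be >= 1")
    full, remainder = divmod(total, batch_size)
    return [batch_size] * full + ([remainder] if remainder else [])


def launch_with_fallback(
    launch: Callable[[Optional[str], int], List[str]],
    count: int,
    zones: Sequence[Optional[str]],
    attempts: int = 3,
    delay_seconds: float = 60.0,
    sleep: Callable[[float], None] = time.sleep,
) -> List[str]:
    """Try each zone in turn; if every zone fails, wait and try the list again.

    `launch(zone, count)` is caller-supplied (an assumption of this sketch).
    It returns instance IDs or raises InsufficientCapacity. A zone of None
    means "do not specify an Availability Zone".
    """
    for attempt in range(attempts):
        for zone in zones:
            try:
                return launch(zone, count)
            except InsufficientCapacity:
                continue  # this zone is short on capacity; try the next one
        if attempt < attempts - 1:
            sleep(delay_seconds)  # capacity can shift, so pause before retrying
    raise InsufficientCapacity(
        f"no capacity for {count} instance(s) in zones {list(zones)}"
    )
```

For the example in the list, `split_request(15, 5)` gives three batches of 5, and each batch can then be passed to `launch_with_fallback` with a zone list such as `["eu-west-1a", "eu-west-1b", "eu-west-1c"]`. The delay and attempt counts are placeholders to tune for your workload.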

If the preceding troubleshooting steps don't resolve the problem, then you can move the instance to another VPC or to another subnet and Availability Zone [3].

To avoid insufficient capacity errors on critical machines, consider using On-Demand Capacity Reservations. To use an On-Demand Capacity Reservation, do the following:

  1. Create the Capacity Reservation [4] in an Availability Zone.
  2. Launch critical instances into your Capacity Reservation [5]. You can view real-time Capacity Reservation usage, and launch instances into it as needed.
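As a rough illustration of those two steps, the sketch below builds the request parameters and then passes them to boto3's EC2 client. The helper names (`reservation_params`, `launch_params`, `reserve_and_launch`), the Linux platform, the open-ended reservation, and the `targeted` match criteria are all assumptions chosen for the example, not settings taken from this thread. Note that a Capacity Reservation is billed whether or not instances are running in it, so treat this as a starting point rather than a recommendation.

```python
from typing import Any, Dict


def reservation_params(instance_type: str, zone: str, count: int) -> Dict[str, Any]:
    """Build arguments for EC2 create_capacity_reservation (illustrative values)."""
    return {
        "InstanceType": instance_type,
        "InstancePlatform": "Linux/UNIX",      # assumption: Linux GPU workers
        "AvailabilityZone": zone,
        "InstanceCount": count,
        "EndDateType": "unlimited",            # reservation lasts until cancelled
        "InstanceMatchCriteria": "targeted",   # only launches naming it use it
    }


def launch_params(
    image_id: str, instance_type: str, reservation_id: str, count: int = 1
) -> Dict[str, Any]:
    """Build arguments for EC2 run_instances that target a specific reservation."""
    return {
        "ImageId": image_id,
        "InstanceType": instance_type,
        "MinCount": count,
        "MaxCount": count,
        "CapacityReservationSpecification": {
            "CapacityReservationTarget": {"CapacityReservationId": reservation_id}
        },
    }


def reserve_and_launch(image_id: str, instance_type: str, zone: str, count: int):
    """Step 1: create the reservation. Step 2: launch instances into it."""
    import boto3  # imported here so the helpers above work without boto3 installed

    ec2 = boto3.client("ec2", region_name="eu-west-1")
    reservation = ec2.create_capacity_reservation(
        **reservation_params(instance_type, zone, count)
    )
    reservation_id = reservation["CapacityReservation"]["CapacityReservationId"]
    return ec2.run_instances(
        **launch_params(image_id, instance_type, reservation_id, count)
    )
```

A call such as `reserve_and_launch("ami-...", "g4dn.2xlarge", "eu-west-1b", 2)` would need a real AMI ID and valid credentials; the AMI ID is deliberately left elided here.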

I hope this helps

References:

[1] Change instance type:
https://docs.aws.amazon.com/AWSEC2/latest/UserGuide/ec2-instance-resize.html

[2] Placement group rules and limitations:
https://docs.aws.amazon.com/AWSEC2/latest/UserGuide/placement-groups.html#concepts-placement-groups

[3] Move EC2 to another Subnet, Availability Zone, or VPC:
https://aws.amazon.com/premiumsupport/knowledge-center/move-ec2-instance/

[4] Capacity Reservation:
https://docs.aws.amazon.com/AWSEC2/latest/UserGuide/ec2-capacity-reservations.html

[5] Launch critical instances into your Capacity Reservation:
https://docs.aws.amazon.com/AWSEC2/latest/UserGuide/capacity-reservations-using.html#capacity-reservations-launch

[6] How do I troubleshoot InsufficientInstanceCapacity errors when starting or launching an EC2 instance:
https://aws.amazon.com/premiumsupport/knowledge-center/ec2-insufficient-capacity-errors/

answered 2 years ago
Hi Mabandla,

Thank you for the detailed reply. In the first instance I think that I will look at making my application agnostic to the availability zone in which the g4dn instances are running.

I was curious, though, whether you know of anything specific going on with GPU instances in eu-west-1. Over the last couple of days, On-Demand requests for just 1 g4dn.2xlarge instance have taken 20-30 minutes to come up on multiple occasions, whereas in the past I'm sure single instances usually took a couple of minutes at most. I haven't yet tried other g4dn sizes, such as the smaller g4dn.xlarge, to see whether similar issues occur there.

Regards

Alan

Edited by: abroun on Nov 16, 2021 8:58 AM

abroun
answered 2 years ago
