Unable to use a GPU instance in SageMaker

Hey everyone, I'm trying to train a model on SageMaker and want to use a GPU instance. I can confirm that I am not on the free tier, but I cannot see the P3 or P5 instances in the instance list. In the image selection, I also cannot find any TensorFlow image. I tried a G5 instance as well, but I get the error shown in the third image when I try to run the notebook instance after selecting it. Any help would be appreciated.

[Screenshots: Instances | Images I see | Error I receive]

Wahaj11
asked 5 months ago · 862 views
2 Answers

There are a couple of things to check here:

G/P instance quota availability

Certain instance types, such as G5 and P4, need a quota increase in the AWS console because they are not enabled by default in your account. The error shown probably refers to something like this:

ResourceLimitExceeded: The account-level service limit 'Studio KernelGateway Apps running on ml.g5.xlarge instance' is 0 Apps, with current utilization of 0 Apps and a request delta of 1 Apps. Please use AWS Service Quotas to request an increase for this quota. If AWS Service Quotas is not available, contact AWS support to request an increase for this quota

You can check your EC2 quotas under Service Quotas in the AWS console by searching for instance families such as G (filter: "Running On-Demand G") or P (filter: "Running On-Demand P").
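The same lookup can be sketched with boto3 instead of the console. This is a minimal example, assuming boto3 is installed and credentials are configured; the helper names `matching_quotas` and `fetch_quotas` are illustrative and not part of any AWS SDK, and the region in the usage comment is a placeholder.

```python
def matching_quotas(quotas, needle):
    """Return (name, value) pairs whose QuotaName contains `needle`, ignoring case."""
    needle = needle.lower()
    return [
        (q["QuotaName"], q["Value"])
        for q in quotas
        if needle in q["QuotaName"].lower()
    ]


def fetch_quotas(service_code, region_name):
    """Fetch the applied quotas for one service (needs boto3 and AWS credentials)."""
    import boto3  # imported lazily so matching_quotas can be used without it

    client = boto3.client("service-quotas", region_name=region_name)
    paginator = client.get_paginator("list_service_quotas")
    quotas = []
    for page in paginator.paginate(ServiceCode=service_code):
        quotas.extend(page["Quotas"])
    return quotas


# Example usage (region is an assumption; use the one your Studio domain runs in):
#   for name, value in matching_quotas(fetch_quotas("ec2", "us-east-1"),
#                                      "Running On-Demand G"):
#       print(value, name)
```

Passing "sagemaker" as the service code instead of "ec2" lists the SageMaker quotas, which is the family the ResourceLimitExceeded message above refers to.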

EC2 Instance type available in region

On the other hand, if an instance type does not appear in the list, it is probably not available in the region where you deployed SageMaker Studio. You can look at the available instances per region under On-Demand Plans for Amazon EC2 and verify that the instance type is offered there.
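A rough way to run that regional check from code is the EC2 `describe_instance_type_offerings` call. The sketch below assumes an EC2 client created with boto3 for the region in question; `to_ec2_type` and `offered_in_region` are hypothetical helper names. Note that an EC2 offering in a region does not by itself guarantee that SageMaker offers the matching `ml.` instance type there.

```python
def to_ec2_type(instance_type):
    """Drop SageMaker's 'ml.' prefix, e.g. 'ml.g5.xlarge' becomes 'g5.xlarge'."""
    return instance_type[3:] if instance_type.startswith("ml.") else instance_type


def offered_in_region(ec2_client, instance_type):
    """Return True if the client's region lists the given instance type."""
    response = ec2_client.describe_instance_type_offerings(
        LocationType="region",
        Filters=[{"Name": "instance-type", "Values": [to_ec2_type(instance_type)]}],
    )
    return len(response["InstanceTypeOfferings"]) > 0


# Example usage (region is a placeholder):
#   import boto3
#   ec2 = boto3.client("ec2", region_name="eu-west-1")
#   print(offered_in_region(ec2, "ml.g5.xlarge"))
```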

*If you find this useful and it solves your question, please remember to accept the answer.

AWS
avelizf
answered 5 months ago
  • This response was helpful: increasing the quota enabled me to run a g4dn.xlarge instance from within EC2.

    Unfortunately, SageMaker still gives the same error message: "Unable to complete operation. Please try again."

    I checked the execution role and attached AmazonEC2FullAccess and AmazonSageMakerFullAccess permissions without success.

    Note that the SageMaker instance is accessed via Identity Center.

    I have already spent a lot of time on this and have run out of ideas. Can you think of anything else?

If you're looking for the g4dn.xlarge instance, the quota you want to increase is actually for SageMaker, not for EC2: "Studio JupyterLab Apps running on ml.g4dn.xlarge instances".
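If you prefer to file that request from code rather than the console, something like the following might work. It is only a sketch: `request_quota_increase` is a hypothetical helper, `client` is assumed to be a boto3 "service-quotas" client, and the quota is matched by a fragment of its name because the exact wording shown in your account may differ slightly from the one quoted above.

```python
def request_quota_increase(client, name_fragment, desired_value, service_code="sagemaker"):
    """Find the single quota whose name contains `name_fragment` and request a new value."""
    matches = []
    paginator = client.get_paginator("list_service_quotas")
    for page in paginator.paginate(ServiceCode=service_code):
        for quota in page["Quotas"]:
            if name_fragment.lower() in quota["QuotaName"].lower():
                matches.append(quota)

    # Refuse to guess when the fragment is ambiguous or matches nothing.
    if len(matches) != 1:
        names = [q["QuotaName"] for q in matches]
        raise ValueError(f"expected exactly one matching quota, found {len(matches)}: {names}")

    return client.request_service_quota_increase(
        ServiceCode=service_code,
        QuotaCode=matches[0]["QuotaCode"],
        DesiredValue=float(desired_value),
    )


# Example usage (region is a placeholder):
#   import boto3
#   sq = boto3.client("service-quotas", region_name="us-east-1")
#   request_quota_increase(sq, "JupyterLab Apps running on ml.g4dn.xlarge", 1)
```

The increase is a request, not an immediate change, so the new limit only applies once AWS approves it.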

rubi242
answered 2 months ago
