SAGEMAKER - ResourceLimitExceeded


ResourceLimitExceeded: An error occurred (ResourceLimitExceeded) when calling the CreateEndpoint operation: The account-level service limit 'Memory size in MB per serverless endpoint' is 3072 MBs, with current utilization of 0 MBs and a request delta of 6144 MBs. Please use AWS Service Quotas to request an increase for this quota. If AWS Service Quotas is not available, contact AWS support to request an increase for this quota.

I don't understand which path to follow in Service Quotas to increase "MemorySizeInByte" for a SageMaker serverless deployment.

Dario
Asked 9 months ago · 525 views
1 Answer
  1. Go to the service limit increase request page in Service Quotas
  2. For Limit Type, search for and select SageMaker Endpoints
  3. Make sure the region is the same as the region displayed in the top-right corner of your AWS console
  4. Provide a description and details about the error message
  5. Click Submit

You'll get a response within 24-48 hours of logging the support case for the limit increase.
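If you prefer the CLI, the same request can be sketched with the AWS Service Quotas commands below. The quota code is account-agnostic but I haven't verified it here, so look it up first; the query filter is an assumption about the quota's name.

```shell
# Find the quota code for the serverless-endpoint memory limit
# (the 'serverless' name filter is an assumption -- check the table it prints).
aws service-quotas list-service-quotas \
    --service-code sagemaker \
    --query "Quotas[?contains(QuotaName, 'serverless')].[QuotaName,QuotaCode,Value]" \
    --output table

# Request an increase using the QuotaCode printed above
# (replace L-XXXXXXXX with the real code; 6144 MB covers the failing request).
aws service-quotas request-service-quota-increase \
    --service-code sagemaker \
    --quota-code L-XXXXXXXX \
    --desired-value 6144
```

The CLI route files the same request as the console form, so the 24-48 hour response window applies either way.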

Note: Without looking further into the details of this error message in your account, it'd be a little hard to tell what exactly is blocking this; the closest service limit listed in the drop-down is SageMaker Endpoints, which is why I'm suggesting it here.

Also, if you have a support plan that lets you create support cases, I'd suggest logging a case under the Technical category, as support would be in a better position to tell you exactly what's going on and which limits are blocking this.

Hope you find this information helpful.

Comment here if you have additional questions, happy to help.

Abhishek

AWS
Expert
Answered 9 months ago
  • ```python
    transformers_version = '4.12.3'
    pytorch_version = '1.9.1'
    py_version = 'py38'
    region = SESSION.boto_region_name

    image_uri = sagemaker.image_uris.retrieve(
        framework='huggingface',
        base_framework_version=f'pytorch{pytorch_version}',
        region=region,
        version=transformers_version,
        py_version=py_version,
        instance_type='ml.m5.large',  # No GPU support on serverless inference
        image_scope='inference'
    )

    huggingface_model = HuggingFaceModel(
        model_data=model_s3_uri,  # path to your trained sagemaker model
        role=role,                # iam role with permissions to create an Endpoint
        py_version="py38",        # python version of the DLC
        image_uri=image_uri,      # sagemaker container image uri
        env={
            "MMS_MAX_RESPONSE_SIZE": "20000000",
            "SAGEMAKER_CONTAINER_LOG_LEVEL": "20",
            "SAGEMAKER_PROGRAM": "inference.py",
            "SAGEMAKER_REGION": "us-east-1",
            "SAGEMAKER_SUBMIT_DIRECTORY": "/opt/ml/model/code"
        }
    )

    from sagemaker.serverless import ServerlessInferenceConfig

    serverless_config = ServerlessInferenceConfig(
        memory_size_in_mb=6019,
        max_concurrency=8,
    )
    ```

    WHERE THE ERROR OCCURS:

    ```python
    predictor = huggingface_model.deploy(
        serverless_inference_config=serverless_config
    )
    ```
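The gap between the 6019 MB in the code and the 6144 MB "request delta" in the error is likely explained by the valid serverless memory sizes: `memory_size_in_mb` must be a multiple of 1024 MB between 1024 and 6144, so a 6019 MB request appears to be evaluated at the next step up, 6144 MB, which exceeds the default 3072 MB quota. A minimal sketch of that rounding (my own illustration, not SageMaker code):

```python
# Valid serverless-endpoint memory sizes (MB): 1 GB steps from 1 GB to 6 GB.
VALID_SIZES_MB = [1024, 2048, 3072, 4096, 5120, 6144]

def effective_memory_mb(requested_mb: int) -> int:
    """Round a requested size up to the next valid serverless memory size."""
    for size in VALID_SIZES_MB:
        if requested_mb <= size:
            return size
    raise ValueError(f"{requested_mb} MB exceeds the 6144 MB serverless maximum")

print(effective_memory_mb(6019))  # 6144 -- the delta reported in the error
print(effective_memory_mb(3000))  # 3072 -- would fit within the default quota
```

Until the quota increase is approved, setting `memory_size_in_mb` to 3072 or less should let the `deploy` call go through.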
