SageMaker endpoint endlessly "Creating" after Hugging Face fetch error

I attempted to create an endpoint using the Hugging Face support in the sagemaker Python package:

from sagemaker.huggingface import HuggingFaceModel
import sagemaker

role = sagemaker.get_execution_role()  # IAM execution role SageMaker uses to create and run the endpoint
# Hub Model configuration. https://huggingface.co/models
hub = {
	'HF_MODEL_ID':'MY_ACC/MY_MODEL_NAME',
	'HF_TASK':'text-generation'
}

# create Hugging Face Model Class
huggingface_model = HuggingFaceModel(
	transformers_version='4.6.1',
	pytorch_version='1.7.1',
	py_version='py36',
	env=hub,
	role=role, 
)

# deploy model to SageMaker Inference
predictor = huggingface_model.deploy(
	initial_instance_count=1, # number of instances
	instance_type='ml.m5.xlarge' # ec2 instance type
)

# send a test request to the deployed endpoint
predictor.predict({
	'inputs': "Can you please let us know more details about your "
})

MY_ACC/MY_MODEL_NAME is a private model, so the endpoint creation kept logging the following error over and over:

This is an experimental beta features, which allows downloading model from the Hugging Face Hub on start up. It loads the model defined in the env var `HF_MODEL_ID`
Traceback (most recent call last):
  File "/usr/local/bin/dockerd-entrypoint.py", line 23, in <module>
    serving.main()
  File "/opt/conda/lib/python3.6/site-packages/sagemaker_huggingface_inference_toolkit/serving.py", line 34, in main
    _start_mms()
  File "/opt/conda/lib/python3.6/site-packages/retrying.py", line 49, in wrapped_f
    return Retrying(*dargs, **dkw).call(f, *args, **kw)
  File "/opt/conda/lib/python3.6/site-packages/retrying.py", line 206, in call
    return attempt.get(self._wrap_exception)
  File "/opt/conda/lib/python3.6/site-packages/retrying.py", line 247, in get
    six.reraise(self.value[0], self.value[1], self.value[2])
  File "/opt/conda/lib/python3.6/site-packages/six.py", line 719, in reraise
    raise value
  File "/opt/conda/lib/python3.6/site-packages/retrying.py", line 200, in call
    attempt = Attempt(fn(*args, **kwargs), attempt_number, False)
  File "/opt/conda/lib/python3.6/site-packages/sagemaker_huggingface_inference_toolkit/serving.py", line 30, in _start_mms
    mms_model_server.start_model_server(handler_service=HANDLER_SERVICE)
  File "/opt/conda/lib/python3.6/site-packages/sagemaker_huggingface_inference_toolkit/mms_model_server.py", line 75, in start_model_server
    use_auth_token=HF_API_TOKEN,
  File "/opt/conda/lib/python3.6/site-packages/sagemaker_huggingface_inference_toolkit/transformers_utils.py", line 154, in _load_model_from_hub
    model_info = _api.model_info(repo_id=model_id, revision=revision, token=use_auth_token)
  File "/opt/conda/lib/python3.6/site-packages/huggingface_hub/hf_api.py", line 155, in model_info
    r.raise_for_status()
  File "/opt/conda/lib/python3.6/site-packages/requests/models.py", line 943, in raise_for_status
    raise HTTPError(http_error_msg, response=self)
requests.exceptions.HTTPError: 404 Client Error: Not Found for url: https://huggingface.co/api/models/MY_ACC/MY_MODEL_NAME

It seems to be stuck in a never-ending retry loop, checking for a resource it cannot find. I have killed the Python process that started the endpoint creation, but the endpoint has carried on creating regardless.
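
The 404 itself makes sense to me: the repo is private, and the traceback shows the toolkit passing an HF_API_TOKEN value to the Hub API (use_auth_token=HF_API_TOKEN). I assume the download would go through if that token were supplied in the container environment, something along these lines (untested sketch, the token value is a placeholder):

hub = {
	'HF_MODEL_ID': 'MY_ACC/MY_MODEL_NAME',
	'HF_TASK': 'text-generation',
	# Hugging Face Hub access token with read access to the private repo
	'HF_API_TOKEN': 'MY_HF_TOKEN'
}

But that would only help with the next deployment attempt; right now I am stuck with the endpoint that is already half-created.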

How do I fix this? I just want to delete the endpoint, but that option is greyed-out as it is still in the creation phase.

Thanks

asked 2 years ago · 858 views
1 Answer
Accepted Answer

Sorry, never mind. It does eventually time out when it is stuck like that. Hopefully this is useful to anyone else!
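
For anyone else who hits this: once the endpoint leaves the Creating state and ends up as Failed, the delete option in the console becomes available again, and the clean-up can also be done programmatically. A rough sketch with boto3, assuming the endpoint name from the console and that the endpoint configuration created by deploy() shares that name (it usually does, but check in the console):

import boto3

sm = boto3.client('sagemaker')
endpoint_name = 'MY_ENDPOINT_NAME'  # placeholder: the name shown in the SageMaker console

# Check the current status; deletion is blocked in the console while it is still "Creating"
status = sm.describe_endpoint(EndpointName=endpoint_name)['EndpointStatus']
print(status)

if status != 'Creating':
	# Remove the endpoint and its endpoint configuration so nothing is left behind
	sm.delete_endpoint(EndpointName=endpoint_name)
	sm.delete_endpoint_config(EndpointConfigName=endpoint_name)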

answered 2 years ago
