Hi there,
Without knowing more details about your SageMaker Endpoint, it will be difficult to properly debug your issue. I would like to suggest that, if possible, you open a case with SageMaker Premium Support.
With that being said, I have done some testing with one of our SageMaker samples that uses Business Logic Scripting (BLS) with Stable Diffusion on a Triton Inference Server container. The model.py script demonstrates how to use pb_utils.InferenceRequest and is very similar to the official example. I was able to deploy the endpoint and invoke the model successfully. The container used in testing was 785573368785.dkr.ecr.us-east-1.amazonaws.com/sagemaker-tritonserver:22.10-py3.
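For reference, the BLS pattern in that model.py looks roughly like the sketch below: a Python-backend model calls another model in the repository from inside its own execute() method via pb_utils.InferenceRequest. The model and tensor names ("text_encoder", "input_ids", "last_hidden_state", "encoded_text") are illustrative assumptions, not the sample's exact values; triton_python_backend_utils is only importable inside the Triton container, so it is imported lazily.

```python
# Sketch (with assumed names) of the BLS pattern in a Triton
# Python-backend model.py: one model calls another at inference time.
import numpy as np

def build_bls_request_args(model_name, prompt_ids):
    # Pure helper that gathers the arguments for the BLS call.
    # Names here are assumptions; match them to your model_repository.
    return {
        "model_name": model_name,
        "requested_output_names": ["last_hidden_state"],
        "inputs_np": {"input_ids": np.asarray(prompt_ids, dtype=np.int64)},
    }

class TritonPythonModel:
    def execute(self, requests):
        # Only available inside the Triton container, hence the lazy import.
        import triton_python_backend_utils as pb_utils

        responses = []
        for request in requests:
            ids = pb_utils.get_input_tensor_by_name(request, "input_ids").as_numpy()
            args = build_bls_request_args("text_encoder", ids)
            infer_request = pb_utils.InferenceRequest(
                model_name=args["model_name"],
                requested_output_names=args["requested_output_names"],
                inputs=[pb_utils.Tensor(n, a) for n, a in args["inputs_np"].items()],
            )
            infer_response = infer_request.exec()  # synchronous BLS call
            if infer_response.has_error():
                raise pb_utils.TritonModelException(infer_response.error().message())
            hidden = pb_utils.get_output_tensor_by_name(
                infer_response, "last_hidden_state"
            ).as_numpy()
            responses.append(
                pb_utils.InferenceResponse(
                    output_tensors=[pb_utils.Tensor("encoded_text", hidden)]
                )
            )
        return responses
```

Note that the model named in the BLS call ("text_encoder" here) must already be loaded in the server, which is exactly the issue discussed further down in this thread.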
There are a few bugs that I did encounter in the notebook itself. For example, the environment archive for the main model under the /model_repository/pipeline directory is named sd_env.tar.gz, which caused an error when testing the container locally; renaming it to hf_env.tar.gz fixed this issue. Also, please use the following lines of code when waiting for the endpoint to reach InService:
```python
import time

# sm_client is a boto3 SageMaker client: boto3.client("sagemaker")
resp = sm_client.describe_endpoint(EndpointName=endpoint_name)
status = resp["EndpointStatus"]
print("Status: " + status)

while status == "Creating":
    time.sleep(60)
    resp = sm_client.describe_endpoint(EndpointName=endpoint_name)
    status = resp["EndpointStatus"]
    print("Status: " + status)

print("Arn: " + resp["EndpointArn"])
print("Status: " + status)
```
You may have issues running !pip install -U sagemaker pywidgets numpy PIL if you are using a notebook instance; I only updated the SageMaker SDK and did not encounter any issues with the other libraries. Please note I used an ml.g5.4xlarge instance and the conda_python3 kernel during my testing.
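Once the endpoint is InService, invoking it looks roughly like the sketch below. The input name ("prompt"), shape, and output parsing follow Triton's KServe-style JSON inference protocol and are assumptions; match them to your pipeline's config.pbtxt.

```python
# Hedged sketch of invoking a SageMaker Triton endpoint with a JSON
# payload. Input/output names here are assumptions, not the sample's
# exact values.
import json

def build_triton_payload(prompt):
    # One BYTES input of shape [1, 1] holding the text prompt.
    return json.dumps({
        "inputs": [{
            "name": "prompt",
            "shape": [1, 1],
            "datatype": "BYTES",
            "data": [prompt],
        }]
    })

def invoke(endpoint_name, prompt):
    import boto3  # imported lazily so the sketch parses without boto3
    runtime = boto3.client("sagemaker-runtime")
    resp = runtime.invoke_endpoint(
        EndpointName=endpoint_name,
        ContentType="application/json",  # Triton also accepts binary+JSON
        Body=build_triton_payload(prompt),
    )
    return json.loads(resp["Body"].read())
```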
Thanks @Thayin
I read through the example and figured out what the key step is. I'd already suspected this from something else I read, but I could not work out how to fix it, so thanks for the example! Basically, Triton loads models on demand. As far as I can tell, if you use an ensemble it pre-loads the steps, but if you invoke one model from another it is unaware of the dependency and doesn't (or at least may not) preload the target model. So you need to explicitly preload the models, i.e.:
```python
container = {
    "Image": mme_triton_image_uri,
    "ModelDataUrl": model_data_url,
    "Environment": {
        "SAGEMAKER_TRITON_DEFAULT_MODEL_NAME": "pipeline",
        "SAGEMAKER_TRITON_LOG_INFO": "false --load-model=text_encoder --load-model=vae",
    },
}
```
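For completeness, this container dict plugs into endpoint creation roughly as sketched below. The model/endpoint name, role ARN, and default instance type are illustrative assumptions (the instance type matches the one mentioned earlier in this thread).

```python
# Sketch of wiring the container dict above into an endpoint with
# boto3. Names, role_arn, and instance type are assumptions.

def production_variants(model_name, instance_type="ml.g5.4xlarge"):
    # Pure helper describing a single-instance variant.
    return [{
        "VariantName": "AllTraffic",
        "ModelName": model_name,
        "InstanceType": instance_type,
        "InitialInstanceCount": 1,
    }]

def deploy(container, role_arn, name="sd-pipeline"):
    import boto3  # imported lazily so the sketch parses without boto3
    sm = boto3.client("sagemaker")
    sm.create_model(ModelName=name, ExecutionRoleArn=role_arn,
                    Containers=[container])
    sm.create_endpoint_config(EndpointConfigName=name,
                              ProductionVariants=production_variants(name))
    sm.create_endpoint(EndpointName=name, EndpointConfigName=name)
```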
Thanks for your help!
PS: This seems to be a slight hack. You are taking advantage of the fact that the value of the logging setting is appended to the container start command, and tacking extra arguments onto the end. That feels a little fragile; perhaps the --load-model arguments should go in their own environment variable so the behaviour is explicitly maintained?