Sagemaker CfnModel not respecting model_data_url or model_data_source

I'm trying to deploy a pre-trained model, uploaded to S3, to an endpoint using CDK. This is how I create the model with CDK:

        model = sagemaker.CfnModel(self, "MyModel-24",
            execution_role_arn=sagemaker_role.role_arn,
            primary_container={
                "image": "763104351884.dkr.ecr.us-east-1.amazonaws.com/pytorch-inference:1.9.0-cpu-py38-ubuntu20.04",
                "model_data_source": {
                    "s3DataSource": {
                        "s3Uri": "s3://my-models/latest/model.tar.gz"
                    }
                },
                "environment": {
                    "SAGEMAKER_PROGRAM": "inference.py",
                    "SAGEMAKER_SUBMIT_DIRECTORY": "s3://my-models/latest/",
                    "SAGEMAKER_REQUIREMENTS": "requirements.txt",
                    "SAGEMAKER_REGION": "us-east-1"
                }
            }
        )

When I look at the model in the AWS console, the "Model data location" is "-", as if the URL was never read. I get the same result when I use model_data_url instead of model_data_source. Interestingly, even when I put an invalid S3 URI in there, the model is still created, again with an empty Model data location.

Is there a glitch in CDK for deploying a model from data in S3?

1 Answer
Accepted Answer

It looks like you're using CDK via Python. I have a similar stack that seems to be working okay, and the main difference is that I'm using the ContainerDefinitionProperty class rather than a plain dict:

sagemaker.CfnModel(
    self,
    "MyModel",
    execution_role_arn=sagemaker_role.role_arn,
    # Note class here:
    primary_container=sagemaker.CfnModel.ContainerDefinitionProperty(
        environment={...},
        image="...",
        model_data_url="s3://...",
    ),
)

Full disclosure: I'm actually referencing asset.s3_object_url rather than an s3:// URI string, where asset is an instance of s3assets.Asset. But I'm pretty sure from the s3assets.Asset.s3ObjectUrl doc that it should behave the same way.

You can find more complete code here, but that sample currently has some other deployment issues, with a fix in progress on PR50.
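Since the plain dict did not raise an error, one way to catch this kind of problem before deploying is to inspect the template that `cdk synth` writes to cdk.out. The sketch below is only an illustration, not part of the stack above: the function names are hypothetical, and it assumes that a dropped setting shows up as a missing ModelDataUrl / ModelDataSource key under PrimaryContainer in the AWS::SageMaker::Model resource. It only looks at PrimaryContainer, so a multi-container model defined through Containers would also be reported.

```python
from __future__ import annotations

import json
from pathlib import Path


def models_missing_data(template: dict) -> list[str]:
    """Return logical IDs of AWS::SageMaker::Model resources whose
    PrimaryContainer has neither ModelDataUrl nor ModelDataSource.

    Hypothetical helper: `template` is a synthesized CloudFormation
    template already parsed into a dict.
    """
    missing = []
    for logical_id, resource in template.get("Resources", {}).items():
        if resource.get("Type") != "AWS::SageMaker::Model":
            continue  # only SageMaker models are of interest here
        container = resource.get("Properties", {}).get("PrimaryContainer", {})
        if "ModelDataUrl" not in container and "ModelDataSource" not in container:
            missing.append(logical_id)
    return missing


def check_template_file(path: str) -> list[str]:
    """Load a synthesized template file and run the check on it."""
    return models_missing_data(json.loads(Path(path).read_text()))
```

After `cdk synth`, you could call something like `check_template_file("cdk.out/MyStack.template.json")`, where MyStack stands in for your own stack name. An empty list means every model in the template carries a model data location.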

Answered a year ago by AWS EXPERT; reviewed a year ago by EXPERT
  • Yep, needed that class. Really wish it would have thrown an error on the plain dict since that definitely wasn't working. Thanks so much, you ended several days of torment.
