Error Creating Endpoint


Hi! The following error happens while trying to create an endpoint from a successfully trained model:

  • In the web console:

The customer:primary container for production variant AllTraffic did not pass the ping health check. Please check CloudWatch logs for this endpoint.

  • CloudWatch logs:

exec: "serve": executable file not found in $PATH

I'm deploying the model using a Lambda step, just as in this notebook. The Lambda step succeeds, and I can see in the AWS web console that the model configuration is created successfully.

The exact same error happens when I create an endpoint for the registered model in the AWS web console, under Inference -> Models. In the console I can see that an inference container was created for the model, with the following characteristics:

  • Image: 763104351884.dkr.ecr.eu-west-3.amazonaws.com/tensorflow-training:2.8-cpu-py39
  • Mode: single model
  • Environment variables (Key: Value):

SAGEMAKER_CONTAINER_LOG_LEVEL: 20
SAGEMAKER_PROGRAM: inference.py
SAGEMAKER_REGION: eu-west-3
SAGEMAKER_SUBMIT_DIRECTORY: /opt/ml/model/code

I have absolutely no clue what is wrong, and I could not find anything relevant online about this problem. Is it necessary to provide a custom Docker image for inference, or something?

For more details, please find the pipeline's model step code below. Any help would be much appreciated!

model = Model(
    image_uri=estimator.training_image_uri(),
    model_data=step_training.properties.ModelArtifacts.S3ModelArtifacts,
    sagemaker_session=sagemaker_session,
    role=sagemaker_role,
    source_dir='code',
    entry_point='inference.py'
)
step_model_create = ModelStep(
    name="CreateModelStep",
    step_args=model.create(instance_type="ml.m5.large"),
)

register_args = model.register(
    content_types=["*"],
    response_types=["application/json"],
    inference_instances=["ml.m5.large"],
    transform_instances=["ml.m5.large"],
    model_package_group_name="test",
    approval_status="Approved",
)
step_model_register = ModelStep(name="RegisterModelStep", step_args=register_args)
Asked a year ago · 319 views

1 Answer

Accepted Answer

Hi, the problem here is that your model's container URI 763104351884.dkr.ecr.eu-west-3.amazonaws.com/tensorflow-training:2.8-cpu-py39 points at a training image, not a TensorFlow inference image. Because each image is optimized for its own function, the serve executable is not available in the training container, which is why the ping health check fails.
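You can see the mismatch by asking the SDK for both image types. Here's a minimal sketch using the region, version, and instance type from your post (the exact tags it resolves may vary with your SDK version):

from sagemaker import image_uris

# Training image -- what the pipeline is currently using:
print(image_uris.retrieve(
    framework="tensorflow",
    region="eu-west-3",
    version="2.8",
    py_version="py39",
    instance_type="ml.m5.large",
    image_scope="training",
))  # e.g. .../tensorflow-training:2.8-cpu-py39

# Inference image -- what the endpoint needs; this one actually
# contains the `serve` entry point the health check looks for:
print(image_uris.retrieve(
    framework="tensorflow",
    region="eu-west-3",
    version="2.8",
    instance_type="ml.m5.large",
    image_scope="inference",
))  # e.g. .../tensorflow-inference:2.8-cpu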

Usually, the framework-specific SDK classes will handle this lookup for you (for example TensorFlowModel(...) as used in the notebook you linked, or sagemaker.tensorflow.TensorFlow.deploy(...) called on the Estimator class).
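For instance, here's a minimal sketch that swaps your generic Model for TensorFlowModel, assuming TensorFlow 2.8 and reusing the variables from your pipeline code:

from sagemaker.tensorflow import TensorFlowModel

# No image_uri needed: TensorFlowModel resolves the matching
# *inference* image from framework_version automatically
model = TensorFlowModel(
    model_data=step_training.properties.ModelArtifacts.S3ModelArtifacts,
    role=sagemaker_role,
    framework_version="2.8",
    sagemaker_session=sagemaker_session,
    source_dir="code",
    entry_point="inference.py",
)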

I see here, though, that you're using the generic Model class, so I guess you don't know (or don't want to commit to) the framework and version at the point the Lambda function runs?

My suggestions would be:

  • Can you use the Pipelines ModelStep to create your model with a framework-specific model class (like the TensorFlowModel sketch above) before calling the Lambda deployment function, similarly to how your linked notebook uses CreateModelStep? This would bake the framework and version into the pipeline definition itself, but it should mean that the inference container image is selected properly and automatically.
  • If you really need to be dynamic, I think you will need to find a way of looking up at least the framework from the training job. From my testing, you can use estimator = sagemaker.tensorflow.TensorFlow.attach("training-job-name") followed by model = estimator.create_model(...) to infer the correct inference container version from a training job, but it still relies on knowing that TensorFlow is the right framework; I'm not aware of a framework-agnostic equivalent. So you could, for example, describe the training job, infer which framework it uses from that information, and then use the relevant framework estimator class's attach() method to figure out the specifics and create your model, as sketched below.
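Here's a rough sketch of that dynamic approach, assuming the training job name is known and that matching a substring of the training image URI is enough to identify the framework (both are illustrative assumptions):

import boto3
from sagemaker.tensorflow import TensorFlow

job_name = "training-job-name"  # hypothetical training job name

# Describe the training job and inspect its training image to
# work out which framework was used
desc = boto3.client("sagemaker").describe_training_job(TrainingJobName=job_name)
training_image = desc["AlgorithmSpecification"]["TrainingImage"]

if "tensorflow" in training_image:
    # attach() recovers the framework version and artifacts from the job;
    # create_model() then resolves the matching *inference* container
    estimator = TensorFlow.attach(job_name)
    model = estimator.create_model(
        entry_point="inference.py",
        source_dir="code",
    )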
AWS Expert
Alex_T
Answered a year ago
