How to create a (Serverless) SageMaker endpoint using an existing TensorFlow .pb (frozen model) file?

Note: I am a senior developer, but am very new to the topic of machine learning.

I have two frozen TensorFlow model weight files: weights_face_v1.0.0.pb and weights_plate_v1.0.0.pb. I also have some Python code using TensorFlow 2 that loads the models and handles basic inference. The models detect faces and license plates, respectively, and the surrounding code converts an input image to a numpy array and applies blurring to the image in the areas that had detections.

I want to create a SageMaker endpoint so that I can run inference on the models. I initially tried a regular (container-based) Lambda function, but that is too slow for our use case. A SageMaker endpoint should give us GPU inference, which should be much faster.

I am struggling to find out how to do this. From what I can tell from reading the documentation and watching some YouTube videos, I need to create my own Docker container. As a start, I can use, for example, 763104351884.dkr.ecr.us-east-1.amazonaws.com/tensorflow-inference:2.8.0-gpu-py39-cu112-ubuntu20.04-sagemaker.

However, I can't find any solid documentation on how I would integrate my other code. How do I send an image to SageMaker? What converts the image to a numpy array? How does it know the tensor names? How do I install additional requirements? How can I use the detections to apply blurring to the image, and how can I return the resulting image?

Can someone here please point me in the right direction? I searched a lot but can't find any example code or blogs that explain this process. Thank you in advance! Your help is much appreciated.

1 Answer

You should package the model files into a model.tar.gz file and use the TensorFlowModel object to deploy the model to a SageMaker endpoint.
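A minimal sketch of that flow, assuming the .pb is exported in TensorFlow SavedModel format (the TensorFlow Serving container expects a numbered version directory containing saved_model.pb and a variables/ folder; a plain frozen graph would first need converting to a SavedModel). The local directory, bucket, and role ARN below are placeholders:

    import tarfile

    # Expected layout inside model.tar.gz for the TensorFlow Serving container:
    #
    #   1/                  <- model version number
    #   |-- saved_model.pb
    #   `-- variables/
    #
    # 'export/1' is a hypothetical local directory holding the SavedModel.
    with tarfile.open('model.tar.gz', 'w:gz') as tar:
        tar.add('export/1', arcname='1')

After uploading the archive to S3, deployment is a few lines with the SageMaker Python SDK:

    from sagemaker.tensorflow import TensorFlowModel

    model = TensorFlowModel(
        model_data='s3://my-bucket/model.tar.gz',             # placeholder S3 path
        role='arn:aws:iam::123456789012:role/SageMakerRole',  # placeholder role
        framework_version='2.8.0',
    )

    # A GPU instance type, since the question asks for GPU inference.
    predictor = model.deploy(
        initial_instance_count=1,
        instance_type='ml.g4dn.xlarge',
    )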

You can see an example here.

This is an example of the same, but with PyTorch.
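On the pre- and post-processing questions (sending the image, converting it to an array, extra pip packages): the SageMaker TensorFlow Serving container lets you supply an inference.py with input_handler and output_handler functions via entry_point/source_dir, and it installs a requirements.txt placed next to it. A hedged sketch; the application/x-image content type and the batch shape are illustrative assumptions about your model:

    # code/inference.py -- hooks picked up by the TensorFlow Serving container.
    # A code/requirements.txt alongside it (e.g. Pillow) is installed automatically.
    import io
    import json

    import numpy as np
    from PIL import Image


    def input_handler(data, context):
        """Turn the raw image bytes of the request into the JSON payload
        that TensorFlow Serving's REST predict API expects."""
        if context.request_content_type == 'application/x-image':
            image = Image.open(io.BytesIO(data.read()))
            array = np.asarray(image, dtype=np.uint8)
            # Wrap in a list: the serving input is a batch of images.
            return json.dumps({'instances': [array.tolist()]})
        raise ValueError(f'Unsupported content type {context.request_content_type}')


    def output_handler(data, context):
        """Return TensorFlow Serving's JSON response (the detections)
        to the caller unchanged."""
        return data.content, context.accept_header

With this shape, the client posts raw image bytes and gets detection boxes back; applying the blur on the client side keeps the endpoint simple (the output handler only sees the model's response, not the original image).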

AWS
answered 2 years ago
  • I've put one of my .pb files in a .tar.gz and uploaded it to S3.

    Then I tried the following in a notebook:

    from sagemaker.local import LocalSession
    from sagemaker.tensorflow import TensorFlow, TensorFlowModel
    
    session = LocalSession()
    session.config = {'local': {'local_code': True}}
    
    role = 'arn:aws:iam::xxxxxxxxxxxxxxxxx:role/SageMakerRole'
    model_dir = 's3://xxxxxxxxxxxxxxxxx/anonymiser/face.tar.gz'
    
    model = TensorFlowModel(
        entry_point='inference.py', # copied from example code
        source_dir = './code',
        role=role,
        model_data=model_dir,
        framework_version='2.3.0',
    )
    
    predictor = model.deploy(initial_instance_count=1, instance_type='ml.t2.medium')

    This has been running for more than an hour now, but it is still pending. I believe something has crashed. I also can't view the endpoint in the console, even though it shows up in the menu.

    Any thoughts?
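One hedged observation on the snippet above: in SageMaker local mode, the SDK expects instance_type='local' (or 'local_gpu'), and the model needs the LocalSession passed in explicitly; combining a LocalSession with a hosted instance type such as ml.t2.medium (which is also CPU-only) could explain the endpoint never becoming available. A sketch of the local-mode variant, with placeholder ARN and S3 path:

    from sagemaker.local import LocalSession
    from sagemaker.tensorflow import TensorFlowModel

    session = LocalSession()
    session.config = {'local': {'local_code': True}}

    model = TensorFlowModel(
        entry_point='inference.py',
        source_dir='./code',
        role='arn:aws:iam::123456789012:role/SageMakerRole',  # placeholder
        model_data='s3://my-bucket/anonymiser/face.tar.gz',   # placeholder
        framework_version='2.8.0',
        sagemaker_session=session,  # otherwise the SDK creates a hosted session
    )

    # 'local' runs the serving container on this machine via Docker.
    predictor = model.deploy(initial_instance_count=1, instance_type='local')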
