SageMaker Edge Manager cannot load Neo-compiled TensorFlow v1 model


Hi,

I followed this example up to the point of training (mnist_estimator.fit(inputs)), then used the Console to compile the model for rasp4b and package it as a Greengrass V2 component. After I deployed the component to a Raspberry Pi 4B alongside the SageMaker Edge Manager component and asked Edge Manager to load the model, the load failed. The Edge Manager logs show:

2023-01-20T21:03:56.651Z [INFO] (Copier) aws.greengrass.SageMakerEdgeManager: stdout. {"version":"1.20220822.836f3023"}[2023-01-20T21:03:56.650][I] Validating SM Edge Cloud model. {scriptName=services.aws.greengrass.SageMakerEdgeManager.lifecycle.run.script, serviceName=aws.greengrass.SageMakerEdgeManager, currentState=RUNNING}
2023-01-20T21:03:56.651Z [INFO] (Copier) aws.greengrass.SageMakerEdgeManager: stdout. {"version":"1.20220822.836f3023"}[2023-01-20T21:03:56.650][I] root certs folder /greengrass/v2/work/aws.greengrass.SageMakerEdgeManager/certs. {scriptName=services.aws.greengrass.SageMakerEdgeManager.lifecycle.run.script, serviceName=aws.greengrass.SageMakerEdgeManager, currentState=RUNNING}
2023-01-20T21:03:56.653Z [INFO] (Copier) aws.greengrass.SageMakerEdgeManager: stdout. {"version":"1.20220822.836f3023"}[2023-01-20T21:03:56.653][I] Certificate chain is valid. {scriptName=services.aws.greengrass.SageMakerEdgeManager.lifecycle.run.script, serviceName=aws.greengrass.SageMakerEdgeManager, currentState=RUNNING}
2023-01-20T21:03:56.653Z [INFO] (Copier) aws.greengrass.SageMakerEdgeManager: stdout. {"version":"1.20220822.836f3023"}[2023-01-20T21:03:56.653][I] Certificate chain is validated with root cert: /greengrass/v2/work/aws.greengrass.SageMakerEdgeManager/certs/eu-west-1.pem. {scriptName=services.aws.greengrass.SageMakerEdgeManager.lifecycle.run.script, serviceName=aws.greengrass.SageMakerEdgeManager, currentState=RUNNING}
2023-01-20T21:03:56.653Z [INFO] (Copier) aws.greengrass.SageMakerEdgeManager: stdout. {"version":"1.20220822.836f3023"}[2023-01-20T21:03:56.653][I] Validating SM Edge Cloud model signature against the provided certificate chain. {scriptName=services.aws.greengrass.SageMakerEdgeManager.lifecycle.run.script, serviceName=aws.greengrass.SageMakerEdgeManager, currentState=RUNNING}
2023-01-20T21:03:56.653Z [INFO] (Copier) aws.greengrass.SageMakerEdgeManager: stdout. {"version":"1.20220822.836f3023"}[2023-01-20T21:03:56.653][I] Extract Public key from certificate successfully. {scriptName=services.aws.greengrass.SageMakerEdgeManager.lifecycle.run.script, serviceName=aws.greengrass.SageMakerEdgeManager, currentState=RUNNING}
2023-01-20T21:03:56.654Z [INFO] (Copier) aws.greengrass.SageMakerEdgeManager: stdout. {"version":"1.20220822.836f3023"}[2023-01-20T21:03:56.653][I] Extract model signature successfully. {scriptName=services.aws.greengrass.SageMakerEdgeManager.lifecycle.run.script, serviceName=aws.greengrass.SageMakerEdgeManager, currentState=RUNNING}
2023-01-20T21:03:56.654Z [INFO] (Copier) aws.greengrass.SageMakerEdgeManager: stdout. {"version":"1.20220822.836f3023"}[2023-01-20T21:03:56.653][I] Verify init successfully..... {scriptName=services.aws.greengrass.SageMakerEdgeManager.lifecycle.run.script, serviceName=aws.greengrass.SageMakerEdgeManager, currentState=RUNNING}
2023-01-20T21:03:56.654Z [INFO] (Copier) aws.greengrass.SageMakerEdgeManager: stdout. {"version":"1.20220822.836f3023"}[2023-01-20T21:03:56.654][I] Digest verify succeed. {scriptName=services.aws.greengrass.SageMakerEdgeManager.lifecycle.run.script, serviceName=aws.greengrass.SageMakerEdgeManager, currentState=RUNNING}
2023-01-20T21:03:56.654Z [INFO] (Copier) aws.greengrass.SageMakerEdgeManager: stdout. {"version":"1.20220822.836f3023"}[2023-01-20T21:03:56.654][I] Validating SM Edge Cloud model signature successfully. {scriptName=services.aws.greengrass.SageMakerEdgeManager.lifecycle.run.script, serviceName=aws.greengrass.SageMakerEdgeManager, currentState=RUNNING}
2023-01-20T21:03:56.654Z [INFO] (Copier) aws.greengrass.SageMakerEdgeManager: stdout. {"version":"1.20220822.836f3023"}[2023-01-20T21:03:56.654][I] Blake2s256 hash /greengrass/v2/work/mnist-test//manifest successfuly.. {scriptName=services.aws.greengrass.SageMakerEdgeManager.lifecycle.run.script, serviceName=aws.greengrass.SageMakerEdgeManager, currentState=RUNNING}
2023-01-20T21:03:57.603Z [INFO] (Copier) aws.greengrass.SageMakerEdgeManager: stdout. {"version":"1.20220822.836f3023"}[2023-01-20T21:03:57.603][I] Blake2s256 hash /greengrass/v2/work/mnist-test//code.ro successfuly.. {scriptName=services.aws.greengrass.SageMakerEdgeManager.lifecycle.run.script, serviceName=aws.greengrass.SageMakerEdgeManager, currentState=RUNNING}
2023-01-20T21:03:57.604Z [INFO] (Copier) aws.greengrass.SageMakerEdgeManager: stdout. {"version":"1.20220822.836f3023"}[2023-01-20T21:03:57.603][I] Blake2s256 hash /greengrass/v2/work/mnist-test//compiled.meta successfuly.. {scriptName=services.aws.greengrass.SageMakerEdgeManager.lifecycle.run.script, serviceName=aws.greengrass.SageMakerEdgeManager, currentState=RUNNING}
2023-01-20T21:03:57.611Z [INFO] (Copier) aws.greengrass.SageMakerEdgeManager: stdout. {"version":"1.20220822.836f3023"}[2023-01-20T21:03:57.610][I] Blake2s256 hash /greengrass/v2/work/mnist-test//compiled.so successfuly.. {scriptName=services.aws.greengrass.SageMakerEdgeManager.lifecycle.run.script, serviceName=aws.greengrass.SageMakerEdgeManager, currentState=RUNNING}
2023-01-20T21:03:57.612Z [INFO] (Copier) aws.greengrass.SageMakerEdgeManager: stdout. {"version":"1.20220822.836f3023"}[2023-01-20T21:03:57.612][I] Blake2s256 hash /greengrass/v2/work/mnist-test//dlr.h successfuly.. {scriptName=services.aws.greengrass.SageMakerEdgeManager.lifecycle.run.script, serviceName=aws.greengrass.SageMakerEdgeManager, currentState=RUNNING}
2023-01-20T21:03:57.857Z [INFO] (Copier) aws.greengrass.SageMakerEdgeManager: stdout. {"version":"1.20220822.836f3023"}[2023-01-20T21:03:57.857][I] Blake2s256 hash /greengrass/v2/work/mnist-test//libdlr.so successfuly.. {scriptName=services.aws.greengrass.SageMakerEdgeManager.lifecycle.run.script, serviceName=aws.greengrass.SageMakerEdgeManager, currentState=RUNNING}
2023-01-20T21:03:57.858Z [INFO] (Copier) aws.greengrass.SageMakerEdgeManager: stdout. {"version":"1.20220822.836f3023"}[2023-01-20T21:03:57.858][I] Creating model object from /greengrass/v2/work/mnist-test/. {scriptName=services.aws.greengrass.SageMakerEdgeManager.lifecycle.run.script, serviceName=aws.greengrass.SageMakerEdgeManager, currentState=RUNNING}
2023-01-20T21:03:57.858Z [INFO] (Copier) aws.greengrass.SageMakerEdgeManager: stdout. {"version":"1.20220822.836f3023"}[2023-01-20T21:03:57.858][I] executing LoadDLR. {scriptName=services.aws.greengrass.SageMakerEdgeManager.lifecycle.run.script, serviceName=aws.greengrass.SageMakerEdgeManager, currentState=RUNNING}
2023-01-20T21:03:57.858Z [INFO] (Copier) aws.greengrass.SageMakerEdgeManager: stdout. {"version":"1.20220822.836f3023"}[2023-01-20T21:03:57.858][I] attempting to open dlr from /greengrass/v2/work/mnist-test//libdlr.so. {scriptName=services.aws.greengrass.SageMakerEdgeManager.lifecycle.run.script, serviceName=aws.greengrass.SageMakerEdgeManager, currentState=RUNNING}
2023-01-20T21:03:57.859Z [INFO] (Copier) aws.greengrass.SageMakerEdgeManager: stdout. {"version":"1.20220822.836f3023"}[2023-01-20T21:03:57.858][I] Model will run on CPU. {scriptName=services.aws.greengrass.SageMakerEdgeManager.lifecycle.run.script, serviceName=aws.greengrass.SageMakerEdgeManager, currentState=RUNNING}
2023-01-20T21:03:57.907Z [INFO] (Copier) aws.greengrass.SageMakerEdgeManager: stdout. {"version":"1.20220822.836f3023"}[2023-01-20T21:03:57.907][I] Created model object from /greengrass/v2/work/mnist-test/. {scriptName=services.aws.greengrass.SageMakerEdgeManager.lifecycle.run.script, serviceName=aws.greengrass.SageMakerEdgeManager, currentState=RUNNING}
2023-01-20T21:03:57.908Z [INFO] (Copier) aws.greengrass.SageMakerEdgeManager: stdout. {"version":"1.20220822.836f3023"}[2023-01-20T21:03:57.907][I] backend name is relayvm. {scriptName=services.aws.greengrass.SageMakerEdgeManager.lifecycle.run.script, serviceName=aws.greengrass.SageMakerEdgeManager, currentState=RUNNING}
2023-01-20T21:03:57.908Z [INFO] (Copier) aws.greengrass.SageMakerEdgeManager: stdout. {"version":"1.20220822.836f3023"}[2023-01-20T21:03:57.907][I] DLR backend = kRELAYVM. {scriptName=services.aws.greengrass.SageMakerEdgeManager.lifecycle.run.script, serviceName=aws.greengrass.SageMakerEdgeManager, currentState=RUNNING}
2023-01-20T21:03:57.909Z [INFO] (Copier) aws.greengrass.SageMakerEdgeManager: stdout. {"version":"1.20220822.836f3023"}[2023-01-20T21:03:57.907][I] Finished populating metadata. {scriptName=services.aws.greengrass.SageMakerEdgeManager.lifecycle.run.script, serviceName=aws.greengrass.SageMakerEdgeManager, currentState=RUNNING}
2023-01-20T21:03:57.909Z [INFO] (Copier) aws.greengrass.SageMakerEdgeManager: stdout. {"version":"1.20220822.836f3023"}[2023-01-20T21:03:57.907][I] Dynamic Ouput Tensor, querying for shape with backend. {scriptName=services.aws.greengrass.SageMakerEdgeManager.lifecycle.run.script, serviceName=aws.greengrass.SageMakerEdgeManager, currentState=RUNNING}
2023-01-20T21:03:57.909Z [INFO] (Copier) aws.greengrass.SageMakerEdgeManager: stdout. {"version":"1.20220822.836f3023"}[2023-01-20T21:03:57.907][E] Output Shape of dynamic model not available, call after predict(). {scriptName=services.aws.greengrass.SageMakerEdgeManager.lifecycle.run.script, serviceName=aws.greengrass.SageMakerEdgeManager, currentState=RUNNING}
2023-01-20T21:03:57.909Z [INFO] (Copier) aws.greengrass.SageMakerEdgeManager: stdout. {"version":"1.20220822.836f3023"}[2023-01-20T21:03:57.907][E] failed to access output shape. {scriptName=services.aws.greengrass.SageMakerEdgeManager.lifecycle.run.script, serviceName=aws.greengrass.SageMakerEdgeManager, currentState=RUNNING}
2023-01-20T21:03:57.909Z [INFO] (Copier) aws.greengrass.SageMakerEdgeManager: stdout. {"version":"1.20220822.836f3023"}[2023-01-20T21:03:57.907][E] Failed to construct model meta, LoadModel request failed. {scriptName=services.aws.greengrass.SageMakerEdgeManager.lifecycle.run.script, serviceName=aws.greengrass.SageMakerEdgeManager, currentState=RUNNING}

The compiled.meta file, on the device, looks like this:

{
    "Requirements": {
        "TargetDevice": "RASP4B",
        "TargetDeviceType": "cpu"
    },
    "Compilation": {
        "CreatedTime": 1674235990.142639
    },
    "Model": {
        "Inputs": [
            {
                "name": "Placeholder",
                "dtype": "float32",
                "shape": [
                    null,
                    784
                ]
            }
        ],
        "Outputs": [
            {
                "name": "softmax_tensor:0",
                "dtype": "float32",
                "shape": [
                    -1,
                    10
                ]
            }
        ]
    }
}
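
A quick way to confirm whether a compiled.meta like the one above still contains dynamic dimensions is to scan its shapes for null or negative entries. The following is a minimal sketch, not part of Edge Manager or Neo: the helper name find_dynamic_dims is hypothetical, and it assumes the file follows the same "Model" / "Inputs" / "Outputs" layout shown above.

```python
import json

# Sample taken from the compiled.meta shown above (trimmed to the "Model" section).
SAMPLE_META = """
{
    "Model": {
        "Inputs": [
            {"name": "Placeholder", "dtype": "float32", "shape": [null, 784]}
        ],
        "Outputs": [
            {"name": "softmax_tensor:0", "dtype": "float32", "shape": [-1, 10]}
        ]
    }
}
"""


def find_dynamic_dims(meta):
    """Return (section, tensor name, shape) for every tensor with a non-fixed dimension.

    Hypothetical helper: assumes the "Model" -> "Inputs"/"Outputs" layout above.
    """
    dynamic = []
    model = meta.get("Model", {})
    for section in ("Inputs", "Outputs"):
        for tensor in model.get(section, []):
            shape = tensor.get("shape") or []
            # JSON null (None in Python) and negative values both mark a
            # dimension that was not fixed at compile time.
            if any(dim is None or dim < 0 for dim in shape):
                dynamic.append((section, tensor.get("name"), shape))
    return dynamic


meta = json.loads(SAMPLE_META)
for section, name, shape in find_dynamic_dims(meta):
    print(f"{section}: {name} has a dynamic shape {shape}")
```

On the device, the same check could be run against the real file by replacing SAMPLE_META with the contents of compiled.meta from the model directory (in the logs above, /greengrass/v2/work/mnist-test/). An empty result would suggest every dimension was fixed at compile time.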

I am unfamiliar with TensorFlow, and I don't know whether this error is something to do with the model itself, its Neo compilation, or Edge Manager.

Thank you for taking the time to read my problem. I'll be thankful for any help you can give.

2 Answers
Accepted Answer

For anyone who comes across this, the problem was in the example code:

https://github.com/aws/amazon-sagemaker-examples/blob/main/sagemaker_neo_compilation_jobs/tensorflow_distributed_mnist/mnist.py#L74

inputs = {INPUT_TENSOR_NAME: tf.placeholder(tf.float32, [None, 784])}

should be

inputs = {INPUT_TENSOR_NAME: tf.placeholder(tf.float32, [1, 784])}
answered a year ago

SageMaker Edge Manager does not support models with dynamic tensors, which is the case with the model you are trying to use. In this specific case, the dynamic dimension is the batch size, that is, the number of images you submit for prediction. You need to set it to 1 at compilation time, as demonstrated in the sample notebook:

output_path = "/".join(mnist_estimator.output_path.split("/")[:-1])
optimized_estimator = mnist_estimator.compile_model(
    target_instance_family="ml_c5",
    input_shape={"data": [1, 784]},  # Batch size 1, flattened 28*28 image (784 values).
    output_path=output_path,
    framework="tensorflow",
    framework_version="1.15.3",
)
AWS
EXPERT
answered a year ago
  • Thank you for your response. I have tried both compiling from the Console and compiling with the SDK using the lines you quoted. The input shape in both cases appears to have been accepted as [1, 784] in the compilation jobs, but after packaging as a Greengrass component, the compiled.meta in the packaged artifacts always lists [null, 784] as the input shape.
