SageMaker Edge Manager cannot load Neo-compiled TensorFlow v1 model


Hi,

I followed this example up to the point of training (mnist_estimator.fit(inputs)), then used the Console to compile the model for rasp4b and package it as a Greengrass V2 component. I deployed that component to a Raspberry Pi 4B alongside the SageMaker Edge Manager component. When I ask Edge Manager to load the model, the request fails. This is in the Edge Manager logs:

2023-01-20T21:03:56.651Z [INFO] (Copier) aws.greengrass.SageMakerEdgeManager: stdout. {"version":"1.20220822.836f3023"}[2023-01-20T21:03:56.650][I] Validating SM Edge Cloud model. {scriptName=services.aws.greengrass.SageMakerEdgeManager.lifecycle.run.script, serviceName=aws.greengrass.SageMakerEdgeManager, currentState=RUNNING}
2023-01-20T21:03:56.651Z [INFO] (Copier) aws.greengrass.SageMakerEdgeManager: stdout. {"version":"1.20220822.836f3023"}[2023-01-20T21:03:56.650][I] root certs folder /greengrass/v2/work/aws.greengrass.SageMakerEdgeManager/certs. {scriptName=services.aws.greengrass.SageMakerEdgeManager.lifecycle.run.script, serviceName=aws.greengrass.SageMakerEdgeManager, currentState=RUNNING}
2023-01-20T21:03:56.653Z [INFO] (Copier) aws.greengrass.SageMakerEdgeManager: stdout. {"version":"1.20220822.836f3023"}[2023-01-20T21:03:56.653][I] Certificate chain is valid. {scriptName=services.aws.greengrass.SageMakerEdgeManager.lifecycle.run.script, serviceName=aws.greengrass.SageMakerEdgeManager, currentState=RUNNING}
2023-01-20T21:03:56.653Z [INFO] (Copier) aws.greengrass.SageMakerEdgeManager: stdout. {"version":"1.20220822.836f3023"}[2023-01-20T21:03:56.653][I] Certificate chain is validated with root cert: /greengrass/v2/work/aws.greengrass.SageMakerEdgeManager/certs/eu-west-1.pem. {scriptName=services.aws.greengrass.SageMakerEdgeManager.lifecycle.run.script, serviceName=aws.greengrass.SageMakerEdgeManager, currentState=RUNNING}
2023-01-20T21:03:56.653Z [INFO] (Copier) aws.greengrass.SageMakerEdgeManager: stdout. {"version":"1.20220822.836f3023"}[2023-01-20T21:03:56.653][I] Validating SM Edge Cloud model signature against the provided certificate chain. {scriptName=services.aws.greengrass.SageMakerEdgeManager.lifecycle.run.script, serviceName=aws.greengrass.SageMakerEdgeManager, currentState=RUNNING}
2023-01-20T21:03:56.653Z [INFO] (Copier) aws.greengrass.SageMakerEdgeManager: stdout. {"version":"1.20220822.836f3023"}[2023-01-20T21:03:56.653][I] Extract Public key from certificate successfully. {scriptName=services.aws.greengrass.SageMakerEdgeManager.lifecycle.run.script, serviceName=aws.greengrass.SageMakerEdgeManager, currentState=RUNNING}
2023-01-20T21:03:56.654Z [INFO] (Copier) aws.greengrass.SageMakerEdgeManager: stdout. {"version":"1.20220822.836f3023"}[2023-01-20T21:03:56.653][I] Extract model signature successfully. {scriptName=services.aws.greengrass.SageMakerEdgeManager.lifecycle.run.script, serviceName=aws.greengrass.SageMakerEdgeManager, currentState=RUNNING}
2023-01-20T21:03:56.654Z [INFO] (Copier) aws.greengrass.SageMakerEdgeManager: stdout. {"version":"1.20220822.836f3023"}[2023-01-20T21:03:56.653][I] Verify init successfully..... {scriptName=services.aws.greengrass.SageMakerEdgeManager.lifecycle.run.script, serviceName=aws.greengrass.SageMakerEdgeManager, currentState=RUNNING}
2023-01-20T21:03:56.654Z [INFO] (Copier) aws.greengrass.SageMakerEdgeManager: stdout. {"version":"1.20220822.836f3023"}[2023-01-20T21:03:56.654][I] Digest verify succeed. {scriptName=services.aws.greengrass.SageMakerEdgeManager.lifecycle.run.script, serviceName=aws.greengrass.SageMakerEdgeManager, currentState=RUNNING}
2023-01-20T21:03:56.654Z [INFO] (Copier) aws.greengrass.SageMakerEdgeManager: stdout. {"version":"1.20220822.836f3023"}[2023-01-20T21:03:56.654][I] Validating SM Edge Cloud model signature successfully. {scriptName=services.aws.greengrass.SageMakerEdgeManager.lifecycle.run.script, serviceName=aws.greengrass.SageMakerEdgeManager, currentState=RUNNING}
2023-01-20T21:03:56.654Z [INFO] (Copier) aws.greengrass.SageMakerEdgeManager: stdout. {"version":"1.20220822.836f3023"}[2023-01-20T21:03:56.654][I] Blake2s256 hash /greengrass/v2/work/mnist-test//manifest successfuly.. {scriptName=services.aws.greengrass.SageMakerEdgeManager.lifecycle.run.script, serviceName=aws.greengrass.SageMakerEdgeManager, currentState=RUNNING}
2023-01-20T21:03:57.603Z [INFO] (Copier) aws.greengrass.SageMakerEdgeManager: stdout. {"version":"1.20220822.836f3023"}[2023-01-20T21:03:57.603][I] Blake2s256 hash /greengrass/v2/work/mnist-test//code.ro successfuly.. {scriptName=services.aws.greengrass.SageMakerEdgeManager.lifecycle.run.script, serviceName=aws.greengrass.SageMakerEdgeManager, currentState=RUNNING}
2023-01-20T21:03:57.604Z [INFO] (Copier) aws.greengrass.SageMakerEdgeManager: stdout. {"version":"1.20220822.836f3023"}[2023-01-20T21:03:57.603][I] Blake2s256 hash /greengrass/v2/work/mnist-test//compiled.meta successfuly.. {scriptName=services.aws.greengrass.SageMakerEdgeManager.lifecycle.run.script, serviceName=aws.greengrass.SageMakerEdgeManager, currentState=RUNNING}
2023-01-20T21:03:57.611Z [INFO] (Copier) aws.greengrass.SageMakerEdgeManager: stdout. {"version":"1.20220822.836f3023"}[2023-01-20T21:03:57.610][I] Blake2s256 hash /greengrass/v2/work/mnist-test//compiled.so successfuly.. {scriptName=services.aws.greengrass.SageMakerEdgeManager.lifecycle.run.script, serviceName=aws.greengrass.SageMakerEdgeManager, currentState=RUNNING}
2023-01-20T21:03:57.612Z [INFO] (Copier) aws.greengrass.SageMakerEdgeManager: stdout. {"version":"1.20220822.836f3023"}[2023-01-20T21:03:57.612][I] Blake2s256 hash /greengrass/v2/work/mnist-test//dlr.h successfuly.. {scriptName=services.aws.greengrass.SageMakerEdgeManager.lifecycle.run.script, serviceName=aws.greengrass.SageMakerEdgeManager, currentState=RUNNING}
2023-01-20T21:03:57.857Z [INFO] (Copier) aws.greengrass.SageMakerEdgeManager: stdout. {"version":"1.20220822.836f3023"}[2023-01-20T21:03:57.857][I] Blake2s256 hash /greengrass/v2/work/mnist-test//libdlr.so successfuly.. {scriptName=services.aws.greengrass.SageMakerEdgeManager.lifecycle.run.script, serviceName=aws.greengrass.SageMakerEdgeManager, currentState=RUNNING}
2023-01-20T21:03:57.858Z [INFO] (Copier) aws.greengrass.SageMakerEdgeManager: stdout. {"version":"1.20220822.836f3023"}[2023-01-20T21:03:57.858][I] Creating model object from /greengrass/v2/work/mnist-test/. {scriptName=services.aws.greengrass.SageMakerEdgeManager.lifecycle.run.script, serviceName=aws.greengrass.SageMakerEdgeManager, currentState=RUNNING}
2023-01-20T21:03:57.858Z [INFO] (Copier) aws.greengrass.SageMakerEdgeManager: stdout. {"version":"1.20220822.836f3023"}[2023-01-20T21:03:57.858][I] executing LoadDLR. {scriptName=services.aws.greengrass.SageMakerEdgeManager.lifecycle.run.script, serviceName=aws.greengrass.SageMakerEdgeManager, currentState=RUNNING}
2023-01-20T21:03:57.858Z [INFO] (Copier) aws.greengrass.SageMakerEdgeManager: stdout. {"version":"1.20220822.836f3023"}[2023-01-20T21:03:57.858][I] attempting to open dlr from /greengrass/v2/work/mnist-test//libdlr.so. {scriptName=services.aws.greengrass.SageMakerEdgeManager.lifecycle.run.script, serviceName=aws.greengrass.SageMakerEdgeManager, currentState=RUNNING}
2023-01-20T21:03:57.859Z [INFO] (Copier) aws.greengrass.SageMakerEdgeManager: stdout. {"version":"1.20220822.836f3023"}[2023-01-20T21:03:57.858][I] Model will run on CPU. {scriptName=services.aws.greengrass.SageMakerEdgeManager.lifecycle.run.script, serviceName=aws.greengrass.SageMakerEdgeManager, currentState=RUNNING}
2023-01-20T21:03:57.907Z [INFO] (Copier) aws.greengrass.SageMakerEdgeManager: stdout. {"version":"1.20220822.836f3023"}[2023-01-20T21:03:57.907][I] Created model object from /greengrass/v2/work/mnist-test/. {scriptName=services.aws.greengrass.SageMakerEdgeManager.lifecycle.run.script, serviceName=aws.greengrass.SageMakerEdgeManager, currentState=RUNNING}
2023-01-20T21:03:57.908Z [INFO] (Copier) aws.greengrass.SageMakerEdgeManager: stdout. {"version":"1.20220822.836f3023"}[2023-01-20T21:03:57.907][I] backend name is relayvm. {scriptName=services.aws.greengrass.SageMakerEdgeManager.lifecycle.run.script, serviceName=aws.greengrass.SageMakerEdgeManager, currentState=RUNNING}
2023-01-20T21:03:57.908Z [INFO] (Copier) aws.greengrass.SageMakerEdgeManager: stdout. {"version":"1.20220822.836f3023"}[2023-01-20T21:03:57.907][I] DLR backend = kRELAYVM. {scriptName=services.aws.greengrass.SageMakerEdgeManager.lifecycle.run.script, serviceName=aws.greengrass.SageMakerEdgeManager, currentState=RUNNING}
2023-01-20T21:03:57.909Z [INFO] (Copier) aws.greengrass.SageMakerEdgeManager: stdout. {"version":"1.20220822.836f3023"}[2023-01-20T21:03:57.907][I] Finished populating metadata. {scriptName=services.aws.greengrass.SageMakerEdgeManager.lifecycle.run.script, serviceName=aws.greengrass.SageMakerEdgeManager, currentState=RUNNING}
2023-01-20T21:03:57.909Z [INFO] (Copier) aws.greengrass.SageMakerEdgeManager: stdout. {"version":"1.20220822.836f3023"}[2023-01-20T21:03:57.907][I] Dynamic Ouput Tensor, querying for shape with backend. {scriptName=services.aws.greengrass.SageMakerEdgeManager.lifecycle.run.script, serviceName=aws.greengrass.SageMakerEdgeManager, currentState=RUNNING}
2023-01-20T21:03:57.909Z [INFO] (Copier) aws.greengrass.SageMakerEdgeManager: stdout. {"version":"1.20220822.836f3023"}[2023-01-20T21:03:57.907][E] Output Shape of dynamic model not available, call after predict(). {scriptName=services.aws.greengrass.SageMakerEdgeManager.lifecycle.run.script, serviceName=aws.greengrass.SageMakerEdgeManager, currentState=RUNNING}
2023-01-20T21:03:57.909Z [INFO] (Copier) aws.greengrass.SageMakerEdgeManager: stdout. {"version":"1.20220822.836f3023"}[2023-01-20T21:03:57.907][E] failed to access output shape. {scriptName=services.aws.greengrass.SageMakerEdgeManager.lifecycle.run.script, serviceName=aws.greengrass.SageMakerEdgeManager, currentState=RUNNING}
2023-01-20T21:03:57.909Z [INFO] (Copier) aws.greengrass.SageMakerEdgeManager: stdout. {"version":"1.20220822.836f3023"}[2023-01-20T21:03:57.907][E] Failed to construct model meta, LoadModel request failed. {scriptName=services.aws.greengrass.SageMakerEdgeManager.lifecycle.run.script, serviceName=aws.greengrass.SageMakerEdgeManager, currentState=RUNNING}
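
Since every line above is wrapped at [INFO] by Greengrass, the agent's own level marker ([I] or [E]) is the part worth filtering on. A minimal sketch of such a filter follows; the regular expression is inferred only from the lines shown here, and the function name agent_errors is hypothetical, not part of any AWS tool.

```python
import re

# Matches the agent's inner "[timestamp][LEVEL] message {scriptName=..." part.
# The pattern is an assumption based on the log excerpt in this post.
ENTRY = re.compile(
    r"\[(?P<ts>\d{4}-\d{2}-\d{2}T[\d:.]+)\]\[(?P<level>[A-Z])\] "
    r"(?P<msg>.*?) \{scriptName="
)


def agent_errors(lines):
    """Return (timestamp, message) for every entry the agent logged at level E."""
    errors = []
    for line in lines:
        match = ENTRY.search(line)
        if match and match.group("level") == "E":
            errors.append((match.group("ts"), match.group("msg")))
    return errors


# Two entries rebuilt from the excerpt above (shared prefix and suffix factored out).
PREFIX = (
    "2023-01-20T21:03:57.909Z [INFO] (Copier) aws.greengrass.SageMakerEdgeManager: "
    'stdout. {"version":"1.20220822.836f3023"}'
)
SUFFIX = (
    " {scriptName=services.aws.greengrass.SageMakerEdgeManager.lifecycle.run.script, "
    "serviceName=aws.greengrass.SageMakerEdgeManager, currentState=RUNNING}"
)
SAMPLE = [
    PREFIX + "[2023-01-20T21:03:57.907][I] Finished populating metadata." + SUFFIX,
    PREFIX + "[2023-01-20T21:03:57.907][E] failed to access output shape." + SUFFIX,
]

for ts, msg in agent_errors(SAMPLE):
    print(ts, msg)
```

Run against the full log, this would surface only the three [E] entries about the dynamic output shape.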

The compiled.meta file, on the device, looks like this:

{
    "Requirements": {
        "TargetDevice": "RASP4B",
        "TargetDeviceType": "cpu"
    },
    "Compilation": {
        "CreatedTime": 1674235990.142639
    },
    "Model": {
        "Inputs": [
            {
                "name": "Placeholder",
                "dtype": "float32",
                "shape": [
                    null,
                    784
                ]
            }
        ],
        "Outputs": [
            {
                "name": "softmax_tensor:0",
                "dtype": "float32",
                "shape": [
                    -1,
                    10
                ]
            }
        ]
    }
}
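
The null and -1 entries in the shapes above are the dynamic dimensions that the "Output Shape of dynamic model not available" error refers to. A small check like the following could flag them before deployment. It is only a sketch: it assumes compiled.meta always has the layout shown here, and dynamic_tensors is a hypothetical helper, not an AWS API.

```python
import json


def dynamic_tensors(meta):
    """List (section, tensor name, shape) for tensors with a null or negative dim."""
    found = []
    model = meta.get("Model", {})
    for section in ("Inputs", "Outputs"):
        for tensor in model.get(section, []):
            shape = tensor.get("shape") or []
            # JSON null loads as None; -1 is the other marker seen in this file.
            if any(dim is None or dim < 0 for dim in shape):
                found.append((section, tensor.get("name"), shape))
    return found


# The "Model" section of the compiled.meta shown above.
META_TEXT = """
{
    "Model": {
        "Inputs": [
            {"name": "Placeholder", "dtype": "float32", "shape": [null, 784]}
        ],
        "Outputs": [
            {"name": "softmax_tensor:0", "dtype": "float32", "shape": [-1, 10]}
        ]
    }
}
"""

report = dynamic_tensors(json.loads(META_TEXT))
for section, name, shape in report:
    print(section, name, shape)
```

On the device, the same function can be applied to the result of json.load on the real compiled.meta file. An empty list would mean every dimension is fixed.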

I am unfamiliar with TensorFlow, and I don't know whether this error is something to do with the model itself, its Neo compilation, or Edge Manager.

Thank you for taking the time to read about my problem. I'd be grateful for any help you can give.

2 Answers
Accepted Answer

For anyone who comes across this, the problem was in the example code:

https://github.com/aws/amazon-sagemaker-examples/blob/main/sagemaker_neo_compilation_jobs/tensorflow_distributed_mnist/mnist.py#L74

inputs = {INPUT_TENSOR_NAME: tf.placeholder(tf.float32, [None, 784])}

should be

inputs = {INPUT_TENSOR_NAME: tf.placeholder(tf.float32, [1, 784])}
Answered a year ago

SageMaker Edge Manager does not support models with dynamic tensors, and your model has one. Here the dynamic dimension is the batch size, that is, the number of images you submit per prediction. You need to fix it to 1 at compilation time, as demonstrated in the sample notebook:

output_path = "/".join(mnist_estimator.output_path.split("/")[:-1])
optimized_estimator = mnist_estimator.compile_model(
    target_instance_family="ml_c5",
    input_shape={"data": [1, 784]},  # Batch size 1; 784 = 28*28 flattened pixels.
    output_path=output_path,
    framework="tensorflow",
    framework_version="1.15.3",
)
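
Before submitting a compilation job, a guard along these lines could confirm that the input_shape dictionary has no dynamic dimension. This is a sketch with a hypothetical helper name (find_dynamic_dims); it is not part of the SageMaker SDK and only inspects the dictionary you pass in, not what Neo writes to compiled.meta.

```python
def find_dynamic_dims(input_shape):
    """Return (tensor name, axis) for every dimension that is not a positive int."""
    problems = []
    for name, shape in input_shape.items():
        for axis, dim in enumerate(shape):
            # None, -1, 0 and non-integers all count as "not fixed" here.
            # bool is excluded explicitly because it is a subclass of int.
            if isinstance(dim, bool) or not isinstance(dim, int) or dim <= 0:
                problems.append((name, axis))
    return problems


fixed = find_dynamic_dims({"data": [1, 784]})
dynamic = find_dynamic_dims({"data": [None, 784]})
print(fixed)
print(dynamic)
```

An empty result means every axis is fixed. Any entry names the tensor and axis that would need a concrete size such as 1.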
AWS
Expert
Answered a year ago
  • Thank you for your response. I have tried both compiling from the Console and compiling with the SDK using the lines you quoted. The input shape in both cases appears to have been accepted as [1, 784] in the compilation jobs, but after packaging as a Greengrass component, the compiled.meta in the packaged artifacts always lists [null, 784] as the input shape.
