SageMaker Neo Compilation - Unable to Neo Compile for FP16 and INT8 precision


I'm trying to Neo-compile a PyTorch YOLOv5 Large model for edge deployment on an NVIDIA Jetson Xavier NX device. I'm able to do it using the default settings for FP32 precision, but I'm unable to do it for FP16 or INT8 precision. I have tried passing the precision in "CompilerOptions" in the OutputConfig, but the output of Neo compilation is still FP32.
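For context, the attempt looked roughly like the sketch below. The bucket, role ARN, job name, and the `precision_mode` key inside `CompilerOptions` are placeholders and assumptions, not confirmed Neo options for this target:

```python
import json

# Hypothetical request body for sagemaker:CreateCompilationJob.
# All names and paths are placeholders.
compilation_job_args = {
    "CompilationJobName": "yolov5l-neo-fp16",
    "RoleArn": "arn:aws:iam::123456789012:role/NeoRole",
    "InputConfig": {
        "S3Uri": "s3://my-bucket/yolov5l/model.tar.gz",
        # Input name/shape for the traced PyTorch model (assumed here).
        "DataInputConfig": '{"input0": [1, 3, 640, 640]}',
        "Framework": "PYTORCH",
    },
    "OutputConfig": {
        "S3OutputLocation": "s3://my-bucket/yolov5l/compiled/",
        "TargetDevice": "jetson_xavier",
        # The precision hint that was attempted; the compiled output
        # still came back as FP32.
        "CompilerOptions": json.dumps({"precision_mode": "FP16"}),
    },
}

# The actual call would be:
# import boto3
# boto3.client("sagemaker").create_compilation_job(**compilation_job_args)
```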

How can I get a Neo-compiled model for FP16 and INT8 precision? Does Neo support these precision modes or not?

1 answer

Unfortunately, Neo doesn't support quantization for Jetson devices: you can only compile FP32 models, and they will still be FP32 after compilation.

I know this is not what you're looking for, but FYI, Neo supports INT8 model optimization only for TFLite models targeting CPU, not GPU. You can check some supported models here: https://docs.amazonaws.cn/en_us/sagemaker/latest/dg/neo-supported-edge-tested-models.html

AWS
answered 2 years ago
