SageMaker inference time with JSON request


Hi,

Currently I'm trying to change the framework the model is served with from TensorFlow to PyTorch. The issue I encounter is the long request-decoding time (JSON request to dict): it takes around 200 ms just to convert the request to a dict with json.loads.

import json
import logging
from time import time

logger = logging.getLogger(__name__)

def input_fn(input_data, content_type):
    """Deserialize the JSON request body into a dict, timing the decode."""
    time_start = time()
    input_data = json.loads(input_data)
    logger.info(f"Input deserialization (input_fn) in {round(time() - time_start, 3)} seconds.")
    return input_data

The issue didn't exist when the TensorFlow framework was used (with the same request, the whole model inference took around 100 ms). I also tried a different library (msgspec), but the decoding time was very similar.
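For context, the decode cost can be reproduced in isolation with a stdlib-only micro-benchmark. The payload shape below is an assumption (an image-like nested list of floats, several MB as JSON text), not the poster's actual request:

```python
# Rough micro-benchmark of json.loads on a large synthetic payload,
# to isolate decode cost from the rest of the inference pipeline.
import json
import timeit

# Hypothetical payload shaped like an image tensor: nested lists of floats
# (a 640x640x3 image serialized as JSON is several MB of text).
payload = json.dumps({"instances": [[0.5] * 640 for _ in range(640 * 3)]})

n = 5
total = timeit.timeit(lambda: json.loads(payload), number=n)
print(f"json.loads: {total / n * 1000:.1f} ms per decode on {len(payload) / 1e6:.1f} MB")
```

If the measured time here matches the ~200 ms seen in the endpoint, the bottleneck really is JSON parsing rather than the serving stack.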

Could you advise a way to decode JSON faster? It somehow works with TensorFlow, so I guess there is a way. Changing the request body type was considered, but with limited access to the sender, that's cumbersome.

Deployment script:

    model_container_image = r"763104351884.dkr.ecr.us-east-1.amazonaws.com/pytorch-inference:2.1.0-cpu-py310-ubuntu20.04-sagemaker-v1.2"
    model_builder = ModelBuilder(
        model_path=model_path,
        schema_builder=SchemaBuilder(sample_input, sample_output),
        mode=Mode.SAGEMAKER_ENDPOINT,
        content_type='application/json',
        accept_type='application/json',
        role_arn='zzz',
        image_uri=model_container_image,
        inference_spec=YoloX(),
        log_level=logging.DEBUG
    )
    built_model = model_builder.build()
    built_model.deploy(
        instance_type="ml.m5.large",
        endpoint_name='zzz',
        initial_instance_count=1,
        endpoint_logging=True)
Pawe842
asked 7 months ago · 185 views
1 Answer

Did you try using a fast JSON parser like orjson (https://github.com/ijl/orjson?tab=readme-ov-file)? Also, if it's image data, it would be better to use application/x-image as the content type.
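A minimal sketch of an input_fn using orjson, assuming the orjson package is installed in the container (it is a third-party dependency, so the sketch falls back to the stdlib json module if the import fails):

```python
import json
import logging
from time import perf_counter

# orjson is a third-party package (pip install orjson); fall back to the
# stdlib parser if it is not available so the handler still works.
try:
    import orjson
    _loads = orjson.loads
except ImportError:
    _loads = json.loads

logger = logging.getLogger(__name__)

def input_fn(input_data, content_type):
    """Deserialize a JSON request body into a dict, timing the decode."""
    start = perf_counter()
    decoded = _loads(input_data)
    logger.info("Input deserialization (input_fn) took %.3f seconds.",
                perf_counter() - start)
    return decoded
```

orjson parses directly from bytes, which also avoids an extra decode step when the request body arrives as bytes rather than str.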

AWS
answered 6 months ago
EXPERT · reviewed 3 months ago
  • Yes, I've tried msgspec, but the improvement was insufficient. Do you know if there is a way to parse JSON with TorchServe the way it is done in TensorFlow Serving?
