SageMaker inference time with JSON request

Hi,

Currently I'm migrating the model from TensorFlow to PyTorch. The issue I encounter is the long request-decoding time (JSON request to dict): it takes around 200 ms just to convert the request to a dict with json.loads.

import json
import logging
from time import time

logger = logging.getLogger(__name__)

def input_fn(input_data, content_type):
    """Deserialize the JSON request body into a dict."""
    time_start = time()
    input_data = json.loads(input_data)
    logger.info(f"Input deserialization (input_fn) in {round(time() - time_start, 3)} seconds.")
    return input_data

The issue didn't exist when the TensorFlow framework was used (with the same request, the whole model inference took around 100 ms). I tried a different library (msgspec), but the decoding time was very similar.
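For reference, here is a rough way to reproduce a decoding time of this magnitude locally by timing json.loads on a synthetic payload; the nested float list is an assumption standing in for the real request body, which isn't shown in the question.

```python
import json
import time

# Build a synthetic JSON payload roughly the shape of image-like data
# (480 rows x 640 floats) -- an assumption, since the real body is unknown.
payload = json.dumps([[0.5] * 640 for _ in range(480)])

start = time.perf_counter()
decoded = json.loads(payload)
elapsed = time.perf_counter() - start

print(f"json.loads took {elapsed * 1000:.1f} ms for {len(payload)} bytes")
```

On a small CPU instance like ml.m5.large, parsing a multi-megabyte text payload in pure Python can plausibly land in the hundreds of milliseconds, which matches the figure reported above.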

Could you advise a way to decode the JSON faster? It somehow works with TensorFlow serving, so I guess there is one. Changing the request body type was considered, but with limited access to the sender it's cumbersome.

Deployment script:

    model_container_image = r"763104351884.dkr.ecr.us-east-1.amazonaws.com/pytorch-inference:2.1.0-cpu-py310-ubuntu20.04-sagemaker-v1.2"
    model_builder = ModelBuilder(
        model_path=model_path,
        schema_builder=SchemaBuilder(sample_input, sample_output),
        mode=Mode.SAGEMAKER_ENDPOINT,
        content_type='application/json',
        accept_type='application/json',
        role_arn='zzz',
        image_uri=model_container_image,
        inference_spec=YoloX(),
        log_level=logging.DEBUG
    )
    built_model = model_builder.build()
    built_model.deploy(
        instance_type="ml.m5.large",
        endpoint_name='zzz',
        initial_instance_count=1,
        endpoint_logging=True)
1 Answer

Did you try using a fast JSON parser like orjson (https://github.com/ijl/orjson)? Also, if it's image data, it would be better to use application/x-image as the content type.
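To illustrate the second suggestion, here is a minimal sketch of an input_fn that accepts a raw-bytes content type so no JSON parsing happens at all. The application/x-image branch and the little-endian float32 layout are assumptions; the actual payload format would have to be agreed with the sender.

```python
import array
import json

def input_fn(input_data, content_type):
    if content_type == "application/x-image":
        # Interpret the body as a flat float32 buffer -- no text parsing.
        pixels = array.array("f")
        pixels.frombytes(input_data)
        return list(pixels)
    # Fall back to the existing JSON path for application/json requests.
    return json.loads(input_data)
```

Skipping the text-to-number conversion entirely is usually far cheaper than any JSON parser, since the bytes are copied straight into a numeric buffer.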

AWS
answered 5 months ago
EXPERT
verified a month ago
  • Yes, I've tried msgspec, but the improvement was insufficient. Do you know if there is a way to parse JSON with TorchServe like there is in TensorFlow Serving?
