Sagemaker inference time with JSON request


Hi,

Currently I'm trying to switch the framework the model is served with from TensorFlow to PyTorch. The issue I'm encountering is the long request-decoding time (JSON request to dict): it takes around 200 ms just to convert the request to a dict with json.loads.

import json
import logging
from time import time

logger = logging.getLogger(__name__)

def input_fn(input_data, content_type):
    """Deserialize the JSON request body into a dict."""
    time_start = time()
    input_data = json.loads(input_data)
    logger.info(f"Input deserialization (input_fn) in {round(time() - time_start, 3)} seconds.")
    return input_data

The issue didn't exist when the TensorFlow framework was used (with the same request, the whole model inference took around 100 ms). I tried a different library (msgspec), but the decoding time was very similar.

Could you advise a solution to decode JSON faster? It somehow works with TensorFlow, so I guess there is a way. Changing the request body type was considered, but with limited access to the sender, that would be cumbersome.
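To confirm the 200 ms is really spent in JSON decoding rather than elsewhere in the handler, a stand-alone micro-benchmark can help. This is a sketch with a hypothetical payload shape (the real request may differ in size and structure):

```python
import json
import time

# Hypothetical payload: a nested structure with ~100k floats, roughly
# imitating a large image-like JSON request (assumption: the real
# request shape and size may differ).
payload = json.dumps({"instances": [[0.5] * 1000 for _ in range(100)]})

start = time.time()
decoded = json.loads(payload)
elapsed = time.time() - start

print(f"json.loads: {round(elapsed, 4)} s for {len(payload)} bytes")
```

Running this inside the inference container (same instance type, same Python) isolates the parser cost from request transport and model overhead.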

Deployment script:

    model_container_image = r"763104351884.dkr.ecr.us-east-1.amazonaws.com/pytorch-inference:2.1.0-cpu-py310-ubuntu20.04-sagemaker-v1.2"
    model_builder = ModelBuilder(
        model_path=model_path,
        schema_builder=SchemaBuilder(sample_input, sample_output),
        mode=Mode.SAGEMAKER_ENDPOINT,
        content_type='application/json',
        accept_type='application/json',
        role_arn='zzz',
        image_uri=model_container_image,
        inference_spec=YoloX(),
        log_level=logging.DEBUG
    )
    built_model = model_builder.build()
    built_model.deploy(
        instance_type="ml.m5.large",
        endpoint_name='zzz',
        initial_instance_count=1,
        endpoint_logging=True)
Pawe842
asked 5 months ago · 148 views
1 Answer

Did you try using a fast JSON parser such as orjson (https://github.com/ijl/orjson)? Also, if it's image data, it would be better to use application/x-image as the content type.
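As a sketch, orjson could be dropped into input_fn with a stdlib fallback (this assumes orjson is installed in the container, e.g. via a requirements.txt packaged with the model; the helper name is illustrative):

```python
import json
import logging
from time import time

logger = logging.getLogger(__name__)

# orjson is a third-party, Rust-backed JSON library with the same
# loads() interface as the stdlib (assumption: it is installed in the
# inference container). Fall back to stdlib json if it is missing.
try:
    import orjson
    _loads = orjson.loads
except ImportError:
    _loads = json.loads

def input_fn(input_data, content_type):
    """Deserialize the JSON request body into a dict, timing the decode."""
    time_start = time()
    decoded = _loads(input_data)
    logger.info(f"Input deserialization (input_fn) in {round(time() - time_start, 3)} seconds.")
    return decoded
```

orjson.loads accepts both str and bytes, so the handler works regardless of how the serving stack passes the request body.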

AWS
answered 5 months ago
EXPERT
reviewed a month ago
  • Yes, I've tried msgspec, but the improvement was insufficient. Do you know if there is a way to parse JSON with TorchServe the way it is done in TensorFlow Serving?
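If the sender can eventually be changed, one way a handler can avoid the JSON cost for image payloads entirely is to branch on the content type and pass raw bytes through. A minimal sketch (the application/x-image branch returns the bytes unchanged for downstream decoding, e.g. with cv2.imdecode or PIL):

```python
import json

def input_fn(input_data, content_type):
    """Dispatch on content type: raw bytes for images, JSON otherwise."""
    if content_type == "application/x-image":
        # Raw image bytes: no JSON decode needed. Downstream code
        # (e.g. cv2.imdecode or PIL.Image.open) turns bytes into an array.
        return input_data
    # Default path: JSON body decoded to a dict.
    return json.loads(input_data)
```

This keeps the JSON path for existing clients while letting updated clients skip serialization overhead altogether.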
