Custom SageMaker Batch "Request Splitter"

Batch Transform has a number of pre-defined "SplitType" / "BatchStrategy" options that allow certain file-types to be automatically split into batches that are < 64MB. These include CSV and JSONL but not JSON.

I'm using Triton with Batch Transform. It's possible to automate the splitting and reassembling of Triton requests (if you assume the first dimension of each tensor is the batch dimension), but this requires custom code. Is this possible with Batch Transform? Can I provide my own transform?

Dave
asked 10 months ago, viewed 332 times
2 Answers
Accepted Answer

Thanks, but I don't think those links are relevant to Triton; there's already an AWS Triton container. What I ended up doing, which worked fine, was to create a Triton pipeline using a Python backend step that performed the batching and an ONNX backend for the model.
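
For illustration only (this is not the code from the answer above), here is a minimal sketch of the split/reassemble logic such a batching step would need. It assumes the first dimension of every tensor is the batch dimension, as in the question. The names `split_batches`, `reassemble`, and `run_in_batches` are hypothetical, and plain nested lists stand in for tensors to keep the example dependency-free; with NumPy arrays the same idea would use slicing and `np.concatenate`.

```python
from typing import Callable, Dict, Iterator, List, Sequence

# Hypothetical representation: tensor name -> rows, where rows index dimension 0.
Tensors = Dict[str, Sequence]


def split_batches(tensors: Tensors, max_batch_size: int) -> Iterator[Tensors]:
    """Yield chunks of at most max_batch_size rows, sliced along dimension 0."""
    if max_batch_size < 1:
        raise ValueError("max_batch_size must be at least 1")
    sizes = {len(rows) for rows in tensors.values()}
    if len(sizes) != 1:
        raise ValueError("all tensors must share the same batch dimension")
    total = sizes.pop()
    for start in range(0, total, max_batch_size):
        end = start + max_batch_size
        yield {name: rows[start:end] for name, rows in tensors.items()}


def reassemble(chunks: List[Tensors]) -> Dict[str, list]:
    """Concatenate per-chunk outputs along dimension 0, preserving chunk order."""
    merged: Dict[str, list] = {}
    for chunk in chunks:
        for name, rows in chunk.items():
            merged.setdefault(name, []).extend(rows)
    return merged


def run_in_batches(
    tensors: Tensors,
    max_batch_size: int,
    infer: Callable[[Tensors], Tensors],
) -> Dict[str, list]:
    """Split the request, call infer once per chunk, and stitch the outputs."""
    return reassemble([infer(chunk) for chunk in split_batches(tensors, max_batch_size)])
```

In this sketch `infer` is a placeholder for whatever actually calls the model; in a Triton setup that call would live in the Python backend step.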

Dave
answered 9 months ago

The input to batch transforms must be of a format that can be split into smaller files to process in parallel. These formats include CSV, JSON, JSON Lines, TFRecord and RecordIO.

The SplitType parameter indicates how to split the records in the input dataset. To split input files into mini-batches when you create a batch transform job, set the SplitType parameter value to Line. If SplitType is set to None or if an input file can't be split into mini-batches, SageMaker uses the entire input file in a single request. You can control the size of the mini-batches by using the BatchStrategy and MaxPayloadInMB parameters. MaxPayloadInMB must not be greater than 100 MB.
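
As a rough sketch of how these parameters fit together, the helper below builds a request for the boto3 `create_transform_job` call with `SplitType` set to `Line`. The helper names, the job and model names, the S3 URIs, the content type, and the instance type are all placeholder assumptions for illustration, not values from this thread.

```python
from typing import Any, Dict


def build_transform_request(
    job_name: str,
    model_name: str,
    input_s3_uri: str,
    output_s3_uri: str,
    max_payload_mb: int = 6,
    batch_strategy: str = "MultiRecord",
) -> Dict[str, Any]:
    """Build keyword arguments for create_transform_job with line-based splitting."""
    if not 0 <= max_payload_mb <= 100:
        raise ValueError("MaxPayloadInMB must not be greater than 100")
    if batch_strategy not in ("MultiRecord", "SingleRecord"):
        raise ValueError("BatchStrategy must be MultiRecord or SingleRecord")
    return {
        "TransformJobName": job_name,
        "ModelName": model_name,
        "BatchStrategy": batch_strategy,  # how records are packed into each request
        "MaxPayloadInMB": max_payload_mb,  # upper bound on a single request payload
        "TransformInput": {
            "DataSource": {
                "S3DataSource": {"S3DataType": "S3Prefix", "S3Uri": input_s3_uri}
            },
            "ContentType": "application/jsonlines",  # assumed input format
            "SplitType": "Line",  # split each input file on newlines
        },
        "TransformOutput": {
            "S3OutputPath": output_s3_uri,
            "AssembleWith": "Line",  # join per-record outputs with newlines
        },
        "TransformResources": {
            "InstanceType": "ml.m5.xlarge",  # placeholder instance type
            "InstanceCount": 1,
        },
    }


def submit(request: Dict[str, Any]) -> None:
    """Send the request to SageMaker; needs boto3 and AWS credentials."""
    import boto3  # imported lazily so building the request works without boto3

    boto3.client("sagemaker").create_transform_job(**request)
```

A call such as `submit(build_transform_request("my-job", "my-model", "s3://my-bucket/in/", "s3://my-bucket/out/"))` would start the job, with every argument here being a made-up example value.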

In your use case, where you are using Triton with Batch Transform and want to automate splitting and reassembling requests by treating the first dimension of each tensor as the batch dimension, you can use your own inference code with Batch Transform and implement that logic there. The links below cover the steps and an example for bringing your own code to Batch Transform.

[1] https://docs.aws.amazon.com/sagemaker/latest/dg/your-algorithms-batch-code.html
[2] https://docs.aws.amazon.com/sagemaker/latest/dg/docker-containers-notebooks.html
[3] https://docs.aws.amazon.com/sagemaker/latest/dg/batch-transform.html
[4] https://docs.aws.amazon.com/sagemaker/latest/dg/batch-transform-data-processing.html

Yash_A
answered 9 months ago