Thanks, but I don't think those links are relevant to Triton; there's already an AWS-provided Triton container. What ended up working fine was creating a Triton ensemble pipeline with a Python backend step that performed the batching and an ONNX backend for the model itself.
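For anyone trying the same approach, here is a minimal sketch of what the Python backend batching step in such an ensemble can look like. The tensor names (`INPUT0`/`OUTPUT0`) and the preprocessing logic are assumptions for illustration; the step would be wired to the ONNX model through the ensemble's `config.pbtxt`.

```python
# model.py for the Python backend step of a Triton ensemble.
# Minimal sketch: tensor names and the reshaping logic are illustrative
# assumptions, not taken from the original post.
import numpy as np
import triton_python_backend_utils as pb_utils


class TritonPythonModel:
    def execute(self, requests):
        responses = []
        for request in requests:
            # Triton treats the first dimension of each tensor as the
            # batch dimension when max_batch_size > 0 in config.pbtxt.
            in_tensor = pb_utils.get_input_tensor_by_name(request, "INPUT0")
            batch = in_tensor.as_numpy()

            # Example preprocessing step: cast and normalize the batch
            # before it is handed to the ONNX model by the ensemble.
            batch = batch.astype(np.float32) / 255.0

            out_tensor = pb_utils.Tensor("OUTPUT0", batch)
            responses.append(
                pb_utils.InferenceResponse(output_tensors=[out_tensor])
            )
        return responses
```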
The input to batch transforms must be in a format that can be split into smaller files to process in parallel. These formats include CSV, JSON, JSON Lines, TFRecord, and RecordIO.
The SplitType parameter indicates how to split the records in the input dataset. To split input files into mini-batches when you create a batch transform job, set the SplitType parameter value to Line. If SplitType is set to None or if an input file can't be split into mini-batches, SageMaker uses the entire input file in a single request. You can control the size of the mini-batches by using the BatchStrategy and MaxPayloadInMB parameters. MaxPayloadInMB must not be greater than 100 MB.
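As an illustration, a CreateTransformJob call combining these parameters might look like the following boto3 sketch. The job, model, and bucket names are placeholders.

```python
# Sketch of a batch transform job that splits JSON Lines input into
# mini-batches. Job, model, and S3 names are placeholders.
import boto3

sm = boto3.client("sagemaker")
sm.create_transform_job(
    TransformJobName="example-transform-job",
    ModelName="example-model",
    BatchStrategy="MultiRecord",   # pack multiple records per request
    MaxPayloadInMB=6,              # per-request payload cap (<= 100 MB)
    TransformInput={
        "DataSource": {
            "S3DataSource": {
                "S3DataType": "S3Prefix",
                "S3Uri": "s3://example-bucket/input/",
            }
        },
        "ContentType": "application/jsonlines",
        "SplitType": "Line",       # split input files on newlines
    },
    TransformOutput={
        "S3OutputPath": "s3://example-bucket/output/",
        "AssembleWith": "Line",    # reassemble results line by line
    },
    TransformResources={
        "InstanceType": "ml.m5.xlarge",
        "InstanceCount": 1,
    },
)
```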
For your use case, where you are using Triton with Batch Transform and want the splitting and reassembling to be automated by treating the first dimension of each tensor as the batch dimension, you can use your own inference code with Batch Transform and implement that logic yourself. The links below cover the steps and provide examples for bringing your own code to Batch Transform; a minimal sketch of the container contract follows them.
[1] https://docs.aws.amazon.com/sagemaker/latest/dg/your-algorithms-batch-code.html
[2] https://docs.aws.amazon.com/sagemaker/latest/dg/docker-containers-notebooks.html
[3] https://docs.aws.amazon.com/sagemaker/latest/dg/batch-transform.html
[4] https://docs.aws.amazon.com/sagemaker/latest/dg/batch-transform-data-processing.html
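To make the bring-your-own-inference-code path concrete: a Batch Transform container must answer GET /ping and POST /invocations on port 8080 (see link [1]). The sketch below assumes JSON Lines records; `my_model_predict` is a hypothetical stand-in for the actual model call (for example, a Triton or ONNX Runtime invocation).

```python
# Minimal sketch of the HTTP contract for bring-your-own inference code
# with Batch Transform: GET /ping and POST /invocations on port 8080.
# Assumes JSON Lines records; my_model_predict is a hypothetical
# placeholder for the real model invocation.
import json

from flask import Flask, Response, request

app = Flask(__name__)


def my_model_predict(record):
    # Hypothetical placeholder for the real model call.
    return {"prediction": record}


@app.route("/ping", methods=["GET"])
def ping():
    # SageMaker polls this; return 200 once the container is ready.
    return Response(status=200)


@app.route("/invocations", methods=["POST"])
def invocations():
    # With SplitType=Line, each request body is one mini-batch of lines
    # whose total size stays under MaxPayloadInMB.
    lines = request.data.decode("utf-8").splitlines()
    predictions = [my_model_predict(json.loads(line)) for line in lines]
    body = "\n".join(json.dumps(p) for p in predictions)
    return Response(body, status=200, mimetype="application/jsonlines")


if __name__ == "__main__":
    app.run(host="0.0.0.0", port=8080)
```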