To add to that, you can set BatchStrategy to MultiRecord to speed up processing.
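As a general guideline, the number of files to process in S3 should be a multiple of the number of workers/instances, so that each instance gets an equal share of the work. If there is only one input file, only one instance processes it and the rest sit idle.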
If MaxConcurrentTransforms is set to 0 or left unset, Amazon SageMaker checks the optional execution-parameters to determine the settings for your chosen algorithm.
Batch transform partitions the Amazon S3 objects in the input by key and maps them to instances. Please check out https://docs.aws.amazon.com/sagemaker/latest/dg/batch-transform.html
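To make the settings above concrete, here is a rough sketch of how the request could be put together. The helper names (pick_instance_count, build_transform_request), the instance type, the content type and the default of 4 instances are my own assumptions for illustration, not anything SageMaker prescribes; the dictionary keys are the ones the CreateTransformJob API accepts.

```python
def pick_instance_count(num_files, max_instances):
    """Hypothetical helper: largest instance count (<= max_instances)
    that divides the number of input files evenly."""
    if num_files < 1 or max_instances < 1:
        raise ValueError("num_files and max_instances must be >= 1")
    for count in range(min(num_files, max_instances), 0, -1):
        if num_files % count == 0:
            return count
    return 1


def build_transform_request(job_name, model_name, input_uri, output_uri,
                            num_files, max_instances=4,
                            max_concurrent_transforms=None):
    """Hypothetical helper: build keyword arguments for create_transform_job."""
    request = {
        "TransformJobName": job_name,
        "ModelName": model_name,
        # MultiRecord packs several records into each request to the container.
        "BatchStrategy": "MultiRecord",
        "TransformInput": {
            "DataSource": {
                "S3DataSource": {"S3DataType": "S3Prefix", "S3Uri": input_uri}
            },
            "ContentType": "text/csv",  # assumption: CSV input
            "SplitType": "Line",        # split each file into records by line
        },
        "TransformOutput": {"S3OutputPath": output_uri, "AssembleWith": "Line"},
        "TransformResources": {
            "InstanceType": "ml.m5.xlarge",  # assumption: pick what fits your model
            "InstanceCount": pick_instance_count(num_files, max_instances),
        },
    }
    # Leave MaxConcurrentTransforms out unless you want to override it, so that
    # SageMaker can fall back to the container's execution-parameters.
    if max_concurrent_transforms is not None:
        request["MaxConcurrentTransforms"] = max_concurrent_transforms
    return request


request = build_transform_request(
    job_name="example-batch-job",      # placeholder name
    model_name="example-model",        # placeholder name
    input_uri="s3://example-bucket/input/",
    output_uri="s3://example-bucket/output/",
    num_files=12,
)
# To actually submit the job (requires boto3 and AWS credentials):
#   import boto3
#   boto3.client("sagemaker").create_transform_job(**request)
```

With 12 input files and at most 4 instances, this picks 4 instances, so each one gets 3 files. Treat the numbers as a starting point and tune them against your own model and payload sizes.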