How to choose an instance type for SageMaker testing/inference?


Looking at a few examples for training in SageMaker: are there guidelines, based on the model size and the data to be trained on, for what type of instance (CPU/GPU) to use? Also, can one use spot instances (maybe with multiple GPU cores)?

Yes, you can use spot instances. I recommend it, and I always run training on spot instances. If you are using the Python SDK, add the following parameters to your Estimator:

       use_spot_instances=True,
       max_run={maximum runtime in seconds},
       max_wait={maximum wait time in seconds, at least max_run},
       checkpoint_s3_uri={URI of your bucket and folder},
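
To make the parameters above concrete, here is a minimal sketch that builds them as a dictionary and checks the one constraint that tends to trip people up (max_wait has to be at least max_run, both in seconds). The helper name `spot_training_kwargs`, the bucket name, and the instance type in the commented usage are placeholders I made up for illustration, not anything from the SDK or from the question.

```python
def spot_training_kwargs(max_run, max_wait, checkpoint_s3_uri):
    """Build Estimator keyword arguments for managed spot training.

    Hypothetical helper for illustration; only the dictionary keys are
    real SageMaker Python SDK Estimator parameters.
    """
    # max_wait covers waiting for spot capacity plus the run itself,
    # so it cannot be shorter than max_run (both are in seconds).
    if max_wait < max_run:
        raise ValueError("max_wait must be greater than or equal to max_run")
    if not checkpoint_s3_uri.startswith("s3://"):
        raise ValueError("checkpoint_s3_uri must be an s3:// URI")
    return {
        "use_spot_instances": True,
        "max_run": max_run,
        "max_wait": max_wait,
        "checkpoint_s3_uri": checkpoint_s3_uri,
    }


# Placeholder values: 1 hour of training, up to 2 hours including waiting.
kwargs = spot_training_kwargs(3600, 7200, "s3://my-example-bucket/checkpoints/")
print(kwargs)

# Usage with the SDK (needs the sagemaker package and AWS credentials;
# image_uri, role and instance_type below are placeholders):
# from sagemaker.estimator import Estimator
# estimator = Estimator(image_uri=image_uri, role=role, instance_count=1,
#                       instance_type="ml.p3.2xlarge", **kwargs)
```

Note that the checkpoint location only helps if your training script actually writes checkpoints and resumes from them after an interruption.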

See the documentation for more details here:

As far as instance types are concerned, the documentation for the individual algorithms contains some initial recommendations for instance types:

For example, see the EC2 Instance Recommendation for the Image Classification Algorithm:

There was a presentation at re:Invent 2020 - How to choose the right instance type for ML inference:

answered 6 months ago

And for the selection of instance type for inference, you might want to look at Amazon SageMaker Inference Recommender:
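
If it helps, a rough sketch of starting a default Inference Recommender job through boto3 could look like the following. It assumes your model is already registered as a model package version; the job name and both ARNs are placeholders, and the two helper functions are names I made up for this example. Only `create_inference_recommendations_job` and its request fields come from the SageMaker API.

```python
def recommendation_job_request(job_name, role_arn, model_package_version_arn):
    """Build the request for a default Inference Recommender job.

    Hypothetical helper; the keys follow the CreateInferenceRecommendationsJob
    request shape.
    """
    return {
        "JobName": job_name,
        "JobType": "Default",  # "Advanced" runs a custom load test instead
        "RoleArn": role_arn,
        "InputConfig": {"ModelPackageVersionArn": model_package_version_arn},
    }


def start_recommendation_job(request):
    """Submit the job. Needs boto3 and AWS credentials, so it is not called here."""
    import boto3

    client = boto3.client("sagemaker")
    return client.create_inference_recommendations_job(**request)


# Placeholder identifiers, not real resources.
request = recommendation_job_request(
    "my-recommender-job",
    "arn:aws:iam::123456789012:role/ExampleSageMakerRole",
    "arn:aws:sagemaker:us-east-1:123456789012:model-package/example-group/1",
)
print(request)
```

Once the job finishes, its results list candidate instance types with latency and cost metrics that you can compare before picking one.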

answered 5 months ago

