Hi Tim, when you create a SageMaker training job using the estimator, the general best practice is to store your data on S3; the training job will launch instances as specified in the training job configuration. We now also support fast file mode, which lets the training job start faster than file mode does (file mode downloads the data from S3 to the training instance first).

But when you say you used a SageMaker notebook instance to train the model, I assume you were not using SageMaker training jobs but rather running the notebook (.ipynb) directly on the notebook instance. Please note that because SageMaker is a fully managed service, the notebook instance (as well as training instances, hosting instances, etc.) is launched in the service account, so you do not have direct access to those instances. A SageMaker notebook instance uses EBS to store data, and the EBS volume is mounted at /home/ec2-user/SageMaker. Note that the EBS volume used by a notebook instance can only be increased, not decreased. If you want to reduce the EBS volume, you need to create a new notebook instance with a smaller volume and move your data from the previous instance via S3. You will not be able to access that EBS volume from outside of the SageMaker notebook instance.

The general best practice is to store large datasets on S3 and keep only sample data on the notebook instance (to reduce storage). Use that small amount of sample data to test and build your code. Then, when you are ready to train on the whole dataset, launch a SageMaker training job that reads the full dataset from S3. Running the training on the whole dataset on a notebook instance would require a large instance with enough compute power, and you would not be able to perform distributed training across multiple instances.
Comparatively, if you run the training job on SageMaker training instances, you get more flexibility in choosing the instance type, and you can run on multiple instances for distributed training. Lastly, once the SageMaker training job is done, all of its resources are terminated, which saves cost compared to keeping a large notebook instance running. Hope this has helped answer your question.
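As a minimal sketch of the workflow described above (the bucket name, script name, role ARN, and instance choices are placeholders, not from this thread), launching a training job with the PyTorch estimator and fast file mode might look like this:

```python
from sagemaker.pytorch import PyTorch
from sagemaker.inputs import TrainingInput

# Placeholder values -- replace with your own bucket, script, and IAM role.
bucket = "my-example-bucket"
role = "arn:aws:iam::123456789012:role/MySageMakerRole"

estimator = PyTorch(
    entry_point="train.py",          # your training script
    role=role,
    framework_version="2.0",
    py_version="py310",
    instance_count=2,                # >1 enables distributed training
    instance_type="ml.g4dn.xlarge",
)

# Fast file mode streams the data from S3 as the job reads it, instead of
# downloading everything to the training instance first, so the job starts
# sooner than with the default file mode.
train_input = TrainingInput(
    s3_data=f"s3://{bucket}/datasets/train",
    input_mode="FastFile",
)

estimator.fit({"training": train_input})
```

When `fit()` completes, the training instances are torn down automatically and the model artifact is written back to S3, which is the cost advantage mentioned above.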
That's all great advice - thank you, appreciate it!
Hi Melanie, just wondering if you could let me know whether there's a preferable way to create a SageMaker training job. Should I go about it by using sagemaker.session.train(), or by creating an estimator (a PyTorch estimator in my case) and then calling estimator.fit()?
Many thanks Tim
Once you create a model, you can find and download the model through the SageMaker console. If you want to share a model, datasets, or other artifacts from training outside of the notebook instance, I think your best bet is S3. Here are the S3 utilities in the SageMaker Python SDK: https://sagemaker.readthedocs.io/en/stable/api/utility/s3.html?highlight=s3%20upload
Edit: look at SageMaker Feature Store or SageMaker DataWrangler for reusing your preprocessed data.
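A minimal sketch using the S3 utilities linked above (the bucket name and local paths are placeholders):

```python
from sagemaker.s3 import S3Uploader, S3Downloader

bucket = "my-example-bucket"  # placeholder -- use your own bucket

# Upload an artifact (a file or a whole directory) from the notebook
# instance's EBS volume to S3 so it can be shared outside the instance.
uploaded_uri = S3Uploader.upload(
    local_path="/home/ec2-user/SageMaker/model.tar.gz",
    desired_s3_uri=f"s3://{bucket}/artifacts",
)

# Later, from another notebook instance (or anywhere with access to the
# bucket), pull the artifact back down.
S3Downloader.download(
    s3_uri=f"s3://{bucket}/artifacts/model.tar.gz",
    local_path="/tmp/model",
)
```

Both calls use the default SageMaker session credentials, so they work out of the box inside a notebook instance whose execution role has access to the bucket.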