SageMaker notebook jobs dependency


I am working on a single workflow that automatically trains a model, updates the endpoint, and logs the process. The workflow is a Jupyter notebook that I want to run as a SageMaker notebook job. I built and tested the notebook in JupyterLab on a SageMaker notebook instance.

My question: can I turn off the notebook instance after I have created a scheduled job from it? Do the jobs run independently of the notebook instance?

For context, the .ipynb script installs the SageMaker SDK, retrieves data from an S3 bucket, and splits it locally into several files in the notebook's own directory. It also writes the training log to an S3 bucket. All permissions (SageMaker, S3, etc.) come from a role that was assigned to the notebook instance while I was testing, and the same role will be used in the job definition.

Would this job work? Would it still work while the notebook instance I used for testing is shut down? And more generally, is this good practice? Apologies if this is an immature development practice; it is my first time using Amazon services to implement a training process.

Yun
asked 2 months ago, 480 views
1 Answer
Accepted Answer

Yes, you can stop your SageMaker notebook instance after creating a scheduled notebook job on it. The jobs will run independently from the notebook instance.

A scheduled notebook job runs under the IAM role specified in its job definition. If that is the same role you attached to the notebook instance, the job will have access to the same S3 buckets and will be able to call the same SageMaker APIs.

The job definition specifies the notebook path and the schedule. It does not depend on the original notebook instance remaining running. The jobs execute on that schedule, using the compute configured in the job definition and the permissions granted by the IAM role.

It is generally not required to keep the original notebook instance running after creating scheduled jobs. You can stop the instance to avoid ongoing compute costs. The jobs will still run as scheduled.
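As a rough illustration of stopping the instance programmatically, here is a minimal sketch. It assumes you pass in a boto3 SageMaker client; the helper name `stop_if_running` and the instance name in the usage comment are made up for this example. It relies only on the `describe_notebook_instance` and `stop_notebook_instance` client calls.

```python
def stop_if_running(client, name):
    """Stop the notebook instance only if it is currently InService.

    `client` is expected to be a boto3 SageMaker client (or any object
    exposing the same two methods). Returns True if a stop was requested.
    """
    status = client.describe_notebook_instance(
        NotebookInstanceName=name
    )["NotebookInstanceStatus"]
    if status != "InService":
        # Already stopped, stopping, pending, etc.: nothing to do.
        return False
    client.stop_notebook_instance(NotebookInstanceName=name)
    return True


# Usage (requires boto3 and AWS credentials; the instance name is hypothetical):
#   import boto3
#   stop_if_running(boto3.client("sagemaker"), "my-dev-notebook")
```

Checking the status first avoids an error from asking to stop an instance that is not in service.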

For best practices, consider using SageMaker Pipelines for more advanced workflows that chain multiple jobs together based on dependencies. You can define pipelines that run Job A, then Job B, etc.
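To make the "Job A, then Job B" idea concrete, here is a plain-Python sketch of running steps in dependency order. It is not the SageMaker Pipelines API; it only illustrates the ordering that a pipeline definition expresses. The function name and the step names (`train`, `deploy`, `report`) are hypothetical, and it needs Python 3.9+ for `graphlib`.

```python
from graphlib import TopologicalSorter


def run_in_dependency_order(steps, depends_on):
    """Run each callable in `steps` after all of its prerequisites.

    steps: dict mapping step name -> zero-argument callable
    depends_on: dict mapping step name -> iterable of prerequisite names
    Returns the step names in the order they were executed.
    """
    # Map every step to the set of steps that must finish before it.
    graph = {name: set(depends_on.get(name, ())) for name in steps}
    order = list(TopologicalSorter(graph).static_order())
    for name in order:
        steps[name]()
    return order


# Hypothetical three-step workflow: train, then deploy, then report.
executed = []
workflow = {
    "report": lambda: executed.append("report"),
    "deploy": lambda: executed.append("deploy"),
    "train": lambda: executed.append("train"),
}
order = run_in_dependency_order(
    workflow, {"deploy": ["train"], "report": ["deploy"]}
)
```

In a real pipeline each step would be a separate job, and the service would handle scheduling, retries, and passing outputs between steps.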

Expert
answered 2 months ago
