How do you choose the number of Studio Notebook Instances per Data Scientist?


When you calculate the cost of SageMaker Studio Notebooks in the AWS Pricing Calculator, it asks for the "Number of Studio Notebook instances per data scientist per month."

How do you reason about this? What would be the use case for having multiple instances for one data scientist? Would that happen if an individual is working on multiple projects, which have different kernels and library dependencies?

I imagine most of the time it will be 1 Studio Notebook instance per data scientist per month, instead of 2 or more instances per data scientist?

AWS
asked a year ago, 794 views
2 Answers
Accepted Answer

Hi @yann_stoneman, you're right. Up to 4 apps can run on the same instance, so different kernels can still run on one instance. A data scientist might still need more than one instance, though. For example, someone working on both a tabular use case and an image processing use case might have a CPU instance and a GPU instance running. Or they might use a larger instance for data processing or for the Data Wrangler feature.

Depending on your data scientists' projects and use cases, I'd account for at most 2 instances per data scientist running concurrently. If your users already use SageMaker Notebook Instances, you can take the instance type they most commonly use there as the Studio instance type for your estimate - that way the estimate will be closer to the actual costs.

If you're allowing for shared spaces (real-time collaboration), include additional instances in your estimate - users will then be able to use both a private space through their user profile (unique to one user) and a shared space (an instance that can be accessed across profiles).
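The estimate described above can be sketched as a small calculation. This is only a rough illustration, not how the AWS Pricing Calculator works internally: the function name, the usage hours, and the hourly rate are all hypothetical placeholders, so substitute the real rate for your instance type and region.

```python
def estimate_monthly_cost(
    data_scientists: int,
    instances_per_scientist: float,
    hours_per_instance_per_month: float,
    hourly_rate_usd: float,
    shared_space_instances: int = 0,
) -> float:
    """Rough monthly Studio notebook compute cost in USD (illustrative only)."""
    # Instances running in private spaces, one set per user profile.
    private_instances = data_scientists * instances_per_scientist
    # Shared spaces run on their own instances, so they are added on top.
    total_instances = private_instances + shared_space_instances
    return total_instances * hours_per_instance_per_month * hourly_rate_usd


# Hypothetical inputs: 10 data scientists, 160 hours per instance per month,
# and a placeholder rate of $0.05/hour (NOT a real AWS price).
one_each = estimate_monthly_cost(10, 1, 160, 0.05)
two_each = estimate_monthly_cost(10, 2, 160, 0.05, shared_space_instances=2)

print(f"1 instance each:             ${one_each:.2f}")
print(f"2 instances each + 2 shared: ${two_each:.2f}")
```

Running the numbers for both 1 and 2 instances per data scientist gives a lower and upper bound, which is often more useful than a single figure.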

I'd also recommend using a plugin to shut down idle instances as a best practice once your teams are onboarded to Studio, so that instances are shut down when no notebooks are actively running (ref: https://aws.amazon.com/blogs/machine-learning/save-costs-by-automatically-shutting-down-idle-resources-within-amazon-sagemaker-studio/).

AWS
Durga_S
answered a year ago
  • And the ability to run 4 different kernels on the same instance, I assume, is because every SageMaker Studio App runs in a Docker container?

  • Yes! The hard limit on that is 4 apps per instance as of today. One more callout: a user cannot run two instances of the same type, i.e., only one instance of any given instance type is allowed per user profile (without using shared spaces).
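The two limits mentioned in these comments (4 apps per instance, and one instance per instance type per user profile) can be checked against a planned setup with a short sketch. The function name and the plan format are hypothetical, and the instance types are only examples; the limit of 4 is the one quoted above "as of today" and may have changed.

```python
from collections import Counter

# Limit quoted in the comment above; treat it as an assumption that may change.
MAX_APPS_PER_INSTANCE = 4


def find_violations(instances: list[tuple[str, int]]) -> list[str]:
    """Check one user profile's plan, given as (instance_type, app_count) pairs."""
    problems = []

    # Rule 1: only one instance of any given type per user profile.
    type_counts = Counter(instance_type for instance_type, _ in instances)
    for instance_type, count in type_counts.items():
        if count > 1:
            problems.append(f"{instance_type}: {count} instances of the same type")

    # Rule 2: at most MAX_APPS_PER_INSTANCE apps on a single instance.
    for instance_type, app_count in instances:
        if app_count > MAX_APPS_PER_INSTANCE:
            problems.append(
                f"{instance_type}: {app_count} apps exceeds limit of {MAX_APPS_PER_INSTANCE}"
            )

    return problems


# A CPU instance with 3 apps plus a GPU instance with 1 app fits both rules.
print(find_violations([("ml.t3.medium", 3), ("ml.g4dn.xlarge", 1)]))
# Two instances of the same type, one of them overloaded, breaks both rules.
print(find_violations([("ml.t3.medium", 2), ("ml.t3.medium", 5)]))
```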


Notebook instances are not tied to a user. So if two users have the same access rights, they will both see and be able to access the same instance (even at the same time).

The issue is that Jupyter Notebook is not ready for that: both users will have the same privileges, with no tracking of who did what, ... And if they work on the same notebook at the same time, they will basically overwrite each other's saves.

I had a similar need (pair programming between a data scientist and a software engineer) - the only viable solution we were able to find was desktop sharing (like TeamViewer, ...).

answered a year ago
