Can we run a Python script in SageMaker using boto3 from a local machine?


Here's what I am trying to do: in my application, which runs outside AWS, I take some user input and trigger scripts that reside inside a SageMaker notebook instance. I am able to start or create a new instance using boto3, and I can also use a lifecycle configuration to run a starter script when the instance starts. But I want to run multiple scripts at short intervals based on user input, so I don't want to restart my instance each time with a new lifecycle configuration script. Is there a way to execute shell commands in SageMaker using boto3 (or any other way)?

asked 9 months ago, 831 views
1 Answer

It should be possible, but it's probably not a great idea...

This is not really an intended pattern for SageMaker notebooks today. It's more likely that you should be using SageMaker Processing Jobs to run your regular tasks, reading input data from and writing output data to S3 directly rather than relying on local notebook storage.
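As a rough illustration of the Processing Job route, the sketch below builds a request for the boto3 `create_processing_job` call. The helper name `build_processing_request`, the instance type, and the convention of downloading the script to `/opt/ml/processing/input/code` are assumptions for illustration, as are the image URI, role ARN, and S3 locations you would pass in.

```python
def build_processing_request(job_name, image_uri, role_arn, script_s3_uri, output_s3_uri):
    """Build keyword arguments for sagemaker.create_processing_job (hypothetical helper)."""
    # Assumed convention: the script is downloaded here inside the container.
    code_dir = "/opt/ml/processing/input/code"
    script_name = script_s3_uri.rsplit("/", 1)[-1]
    return {
        "ProcessingJobName": job_name,
        "RoleArn": role_arn,
        "AppSpecification": {
            "ImageUri": image_uri,  # any image that has python3 available
            "ContainerEntrypoint": ["python3", f"{code_dir}/{script_name}"],
        },
        "ProcessingInputs": [
            {
                "InputName": "code",
                "S3Input": {
                    "S3Uri": script_s3_uri,
                    "LocalPath": code_dir,
                    "S3DataType": "S3Prefix",
                    "S3InputMode": "File",
                },
            }
        ],
        "ProcessingOutputConfig": {
            "Outputs": [
                {
                    "OutputName": "results",
                    "S3Output": {
                        "S3Uri": output_s3_uri,
                        "LocalPath": "/opt/ml/processing/output",
                        "S3UploadMode": "EndOfJob",
                    },
                }
            ]
        },
        "ProcessingResources": {
            "ClusterConfig": {
                "InstanceCount": 1,
                "InstanceType": "ml.m5.large",  # assumed size, adjust as needed
                "VolumeSizeInGB": 30,
            }
        },
        "StoppingCondition": {"MaxRuntimeInSeconds": 3600},
    }
```

From the local application you would then call something like `boto3.client("sagemaker").create_processing_job(**request)` once per user request, with a unique job name each time.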

With that warning out of the way, a hacky solution:

SageMaker notebooks (both Notebook Instances and Studio) are based on Jupyter and thus today more-or-less conform (with some customizations) to Jupyter's client/server API model, which has both REST and WebSocket/ZeroMQ aspects. This means as long as you're able to handle authentication, it's possible to interact with the notebooks from a script using the same interfaces your browser would.

This automation-style solution would proceed as follows (assuming Python):

  • Use boto3 and the SageMaker CreatePresignedNotebookInstanceUrl API to create a presigned notebook instance URL (Granting this IAM permission is what allows a User/Role/principal to open the notebook)
  • Use a stateful HTTP library like requests to request this URL in a session and save the cookie data set by the response. Fetching the URL logs your client in to Jupyter, and "your client" is the session, so you need to keep it persistent.
  • Use the JupyterServer REST APIs for things like opening terminal or notebook sessions, listing available kernels, listing open sessions, etc.
  • When you have a session open (terminal or notebook), use a WebSocket client library like websocket-client to interact with it (sending commands, receiving results, etc). Remember you'll need to use your same session for authentication.
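A minimal, untested-against-SageMaker sketch of the four steps above might look like the following. The helper names (`jupyter_root`, `terminal_ws_url`, `cookie_header`, `run_in_terminal`) are hypothetical. It assumes Jupyter is served from the root path on a Notebook Instance (`base_path=""`), that the XSRF token is available in an `_xsrf` cookie, and that the terminal WebSocket accepts `["stdin", ...]` JSON frames as in standard Jupyter terminals. It requires the third-party packages boto3, requests, and websocket-client.

```python
import json
from urllib.parse import urlsplit


def jupyter_root(landing_url, base_path=""):
    """Return scheme://host plus an optional base path (assumed empty for NBIs)."""
    parts = urlsplit(landing_url)
    return f"{parts.scheme}://{parts.netloc}{base_path.rstrip('/')}"


def terminal_ws_url(root, terminal_name):
    """Map the HTTP(S) root to the Jupyter terminal WebSocket endpoint."""
    parts = urlsplit(root)
    scheme = "wss" if parts.scheme == "https" else "ws"
    return f"{scheme}://{parts.netloc}{parts.path}/terminals/websocket/{terminal_name}"


def cookie_header(cookies):
    """Serialize a cookie dict into a single Cookie header value."""
    return "; ".join(f"{name}={value}" for name, value in cookies.items())


def run_in_terminal(instance_name, command, base_path=""):
    """Send one shell command to a new Jupyter terminal and return raw frames."""
    # Third-party imports are kept local so the helpers above work without them.
    import boto3
    import requests
    import websocket  # provided by the websocket-client package

    # Step 1: presigned URL (needs sagemaker:CreatePresignedNotebookInstanceUrl).
    sagemaker = boto3.client("sagemaker")
    presigned = sagemaker.create_presigned_notebook_instance_url(
        NotebookInstanceName=instance_name
    )["AuthorizedUrl"]

    # Step 2: fetch it in a persistent session so the auth cookies are kept.
    session = requests.Session()
    landing = session.get(presigned)
    landing.raise_for_status()
    root = jupyter_root(landing.url, base_path)

    # Step 3: create a terminal through the Jupyter REST API.
    # Assumption: the server expects the _xsrf cookie echoed in this header.
    xsrf = session.cookies.get("_xsrf", "")
    created = session.post(f"{root}/api/terminals", headers={"X-XSRFToken": xsrf})
    created.raise_for_status()
    name = created.json()["name"]

    # Step 4: talk to the terminal over WebSocket, reusing the session cookies.
    origin = jupyter_root(root)
    ws = websocket.create_connection(
        terminal_ws_url(root, name),
        cookie=cookie_header(session.cookies.get_dict()),
        origin=origin,
        timeout=10,
    )
    try:
        ws.send(json.dumps(["stdin", command + "\n"]))
        frames = []
        for _ in range(20):  # read a bounded number of output frames
            try:
                frames.append(ws.recv())
            except websocket.WebSocketTimeoutException:
                break
        return frames
    finally:
        ws.close()
```

A call such as `run_in_terminal("my-notebook-instance", "python /home/ec2-user/SageMaker/job.py")` (both names are placeholders) would then return whatever raw frames the terminal sent back before the timeout.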

I think I only have end-to-end examples of this for SMStudio: the deprecated auto-installer of the official SageMaker Studio Auto-Shutdown Extension used this method before SMStudio Lifecycle Configuration Scripts became available, and some rough draft PoCs on GitHub explore the notebook side too, but always with reference to Studio. However, it should be possible for Notebook Instances (NBIs) too with almost the same process: you just need to use the above-mentioned API in place of CreatePresignedDomainUrl, and you may need to check whether the REST api_base_url needs to be adjusted.

It might even be possible to use a higher-level solution like the nbclient library if you can get the authentication to work with it. I would be interested to hear if anyone does!

answered 9 months ago
