Hello,
Please find the SageMaker Distribution image used for JupyterLab below. You can customize this image to fit your requirements.
[+] https://github.com/aws/sagemaker-distribution
Additionally, please find the steps and sample images for building a custom JupyterLab image.
[+] https://docs.aws.amazon.com/sagemaker/latest/dg/studio-updated-jl-admin-guide-custom-images.html
Hope this helps.
You can find the latest images at https://docs.aws.amazon.com/deep-learning-containers/latest/devguide/dlc-release-notes.html and https://github.com/aws/deep-learning-containers/releases/tag/v1.0-pt-ec2-2.4.0-tr-py311
For example, you can try 763104351884.dkr.ecr.us-west-2.amazonaws.com/pytorch-training:2.4.0-cpu-py311-ubuntu22.04-ec2-v1.0
The base images you linked are the same as the ones I linked. Those base containers do not work on SageMaker JupyterLab; they keep giving "internal failure". Have you or anyone on your team tried using a custom image on SageMaker JupyterLab? Could you share a Dockerfile that works?
I have been trying different ways of building containers, but they all eventually give me "internal failure". The error message is so opaque that I have no idea how to fix it.
Did you look at the logs for the notebook instance for details? There could be multiple causes. The instance might not have sufficient resources to run the image properly, so try a larger instance. The issue may also lie in the notebook instance configuration. Without more details, it is tough to pinpoint the issue.
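One way to get more detail than "internal failure" is to ask SageMaker for the failure reason recorded on the app and to read the matching CloudWatch log events. The following is a minimal sketch, not a confirmed diagnosis path for this case. It assumes boto3-style clients are passed in, that the app is named "default", and that Studio logs land in a log group such as /aws/sagemaker/studio; the domain ID, space name, and stream prefix are hypothetical placeholders.

```python
from typing import Any, Optional


def app_failure_reason(sm_client: Any, domain_id: str, space_name: str,
                       app_name: str = "default") -> Optional[str]:
    """Return the FailureReason SageMaker reports for a JupyterLab app, if any."""
    resp = sm_client.describe_app(
        DomainId=domain_id,
        SpaceName=space_name,
        AppType="JupyterLab",
        AppName=app_name,  # assumption: the space's app uses the name "default"
    )
    return resp.get("FailureReason")


def recent_log_messages(logs_client: Any, log_group: str, stream_prefix: str,
                        limit: int = 50) -> list[str]:
    """Return up to `limit` log messages from streams matching the prefix."""
    resp = logs_client.filter_log_events(
        logGroupName=log_group,
        logStreamNamePrefix=stream_prefix,
        limit=limit,
    )
    return [event["message"] for event in resp.get("events", [])]


# Usage (requires AWS credentials; all identifiers below are placeholders):
#   import boto3
#   sm = boto3.client("sagemaker")
#   logs = boto3.client("logs")
#   print(app_failure_reason(sm, "d-xxxxxxxxxxxx", "my-space"))
#   for line in recent_log_messages(logs, "/aws/sagemaker/studio", "d-xxxxxxxxxxxx/"):
#       print(line)
```

The clients are passed in rather than created inside the functions so the same code can be exercised with a stub before pointing it at a real account.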
For creating a custom GPU-enabled container for SageMaker JupyterLab, you can use the NVIDIA CUDA base images as a starting point. While there isn't a specific pre-built GPU image for SageMaker JupyterLab, you can build upon the NVIDIA CUDA images and add the necessary components for SageMaker compatibility.
To create a custom container for SageMaker JupyterLab:
- Start with an NVIDIA CUDA base image in your Dockerfile.
- Install the required dependencies, including PyTorch, NCCL, and other necessary libraries.
- Add the SageMaker-specific components and tools.
- Install any additional packages or libraries you need for your specific use case.
It's important to note that the process for creating a custom container for SageMaker JupyterLab may differ slightly from creating one for SageMaker Studio Classic or SageMaker training jobs. The "internal failure" you're experiencing when using your custom container with SageMaker JupyterLab could be due to missing components or incompatibilities specific to the JupyterLab environment.
To resolve this issue, you may need to:
- Ensure your Dockerfile includes all necessary components for SageMaker JupyterLab compatibility.
- Double-check that you're using the correct base image and versions of libraries compatible with SageMaker JupyterLab.
- Review the SageMaker documentation for any specific requirements or guidelines for custom containers in JupyterLab.
- Consider reaching out to AWS support for assistance with the "internal failure" error, as it may be related to specific configuration issues.
Remember that creating custom containers for SageMaker environments can be complex, and you may need to iterate on your Dockerfile to achieve full compatibility with SageMaker JupyterLab.
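The steps above can be sketched as a small Python helper that renders a Dockerfile. This is an illustration under stated assumptions, not a tested working image: it assumes a Debian/Ubuntu-based base image (such as an NVIDIA CUDA Ubuntu image), and the user name, UID/GID, and server arguments are taken from memory of the custom image guide linked earlier in the thread, so check them against that page. The helper name and its parameters are hypothetical.

```python
def render_jupyterlab_dockerfile(base_image: str, pip_packages: list[str]) -> str:
    """Render Dockerfile text for a custom SageMaker JupyterLab image (sketch)."""
    packages = " ".join(["jupyterlab", *pip_packages])
    lines = [
        f"FROM {base_image}",
        "",
        "# Assumption: JupyterLab apps run as this user and IDs (see the AWS guide).",
        "ARG NB_USER=sagemaker-user",
        "ARG NB_UID=1000",
        "ARG NB_GID=100",
        "",
        "# Assumption: apt-based base image; adjust for other distributions.",
        "RUN apt-get update && apt-get install -y --no-install-recommends python3 python3-pip \\",
        "    && rm -rf /var/lib/apt/lists/*",
        "RUN useradd --create-home --shell /bin/bash --gid ${NB_GID} --uid ${NB_UID} ${NB_USER}",
        f"RUN python3 -m pip install --no-cache-dir {packages}",
        "",
        "USER ${NB_UID}",
        "",
        "# The image has to start JupyterLab itself.",
        'ENTRYPOINT ["jupyter-lab"]',
        'CMD ["--ServerApp.ip=0.0.0.0", "--ServerApp.port=8888", '
        '"--ServerApp.allow_origin=*", "--ServerApp.token=\'\'", '
        '"--ServerApp.base_url=/jupyterlab/default"]',
    ]
    return "\n".join(lines) + "\n"


# Usage: pass the CUDA base image tag you have chosen and your extra packages, e.g.
#   text = render_jupyterlab_dockerfile("<your-cuda-base-image>", ["torch"])
#   open("Dockerfile", "w").write(text)
```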
This response is not really helpful. I am specifically asking which base image to use.
Thank you for the link to the container images. But the SageMaker Distribution images still give me "InternalFailure". You can reproduce the error by spinning up a SageMaker JupyterLab with public.ecr.aws/sagemaker/sagemaker-distribution:2.0-cpu on instance ml.t3.medium. I uploaded the image to my ECR without modification.
Hello Tim,
Sorry for replying late.
Were you able to run the container locally before deploying it on SageMaker Studio? You can run it using docker run -it <image_id> and access JupyterLab running locally.
If the image runs locally without any issues, I would suggest reviewing your app image config, especially to make sure you are creating it for the right app type, i.e. JupyterLab. It should look something like the below.
Thank you, and yes, I was able to start a SageMaker JupyterLab. The non-obvious part was that calling jupyter-lab as the entry point was necessary. I thought the pre-built image would already include the entry point setup, but I was able to figure it out by following the doc you linked.
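For readers landing here with the same error, the app image config with the jupyter-lab entry point can be sketched as the request below. This is a reconstruction under assumptions, not the exact config used in this thread: the config name is a placeholder, and the server arguments are the ones I recall from the custom image guide linked above, so verify them there.

```python
from typing import Any


def jupyterlab_app_image_config(name: str) -> dict[str, Any]:
    """Build keyword arguments for SageMaker's create_app_image_config call."""
    return {
        "AppImageConfigName": name,
        # The JupyterLab-specific key marks this config as app type JupyterLab.
        "JupyterLabAppImageConfig": {
            "ContainerConfig": {
                # Start JupyterLab explicitly as the container entry point.
                "ContainerEntrypoint": ["jupyter-lab"],
                # Assumption: server arguments as given in the AWS guide.
                "ContainerArguments": [
                    "--ServerApp.ip=0.0.0.0",
                    "--ServerApp.port=8888",
                    "--ServerApp.allow_origin=*",
                    "--ServerApp.token=''",
                    "--ServerApp.base_url=/jupyterlab/default",
                ],
            }
        },
    }


# Usage (requires AWS credentials; the name is a placeholder):
#   import boto3
#   boto3.client("sagemaker").create_app_image_config(
#       **jupyterlab_app_image_config("my-jupyterlab-config")
#   )
```

If the image's Dockerfile already declares the same entry point and arguments, the container config here may be redundant; setting it in one place is usually enough.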