Since yesterday, SageMaker Studio has given me an error every time I try to open a notebook using a custom image.
I get this error:
Failed to start kernel
Failed to launch app [**-**-**-ml-m5-large-309d4926425841270d******]. CustomImageError: SageMaker is unable to create an App using the specified ECR image [******.dkr.ecr.us-east-1.amazonaws.com/ecr-sagemaker-shared-services-image@sha256:*****41c91aa96c8ad69a5abea60dfa58edccf06f48f64189d9] .
Inspect the cloudwatch logs for detailed diagnostic information. (Context: RequestId: 35dcea16-c511-40cd-ba69-****, TimeStamp: 1694525815.2390692, Date: Tue Sep 12 13:36:55 2023)
When I check the CloudWatch logs, I don't see any errors or anything unusual:
timestamp,message
1694525797638,'"+ CONDA_DIR=/opt/.sagemakerinternal/conda"
1694525797638,'"+ CONDA_ENV_FILTER=/opt/conda$"
1694525797638,'"+ command -v python"
1694525797638,'"+ [ 0 -eq 0 ]"
1694525797638,'"+ python -c from __future__ import print_function;import sys; print(sys.prefix)"
1694525797638,'"+ SYSTEM_PYTHON_PREFIX=/opt/conda"
1694525797638,'"+ export JUPYTER_PATH=/opt/conda/share/jupyter/"
1694525797638,'"+ [ ! -f /opt/conda/share/jupyter/kernels/python3/kernel.json ]"
1694525797638,'"+ echo Using system included Python3 kernel."
1694525797638,'"+ export PATH=/opt/conda/bin:/usr/local/sbin:/usr/local/bin:/usr/sbin:/usr/bin:/sbin:/bin:/tmp/miniconda3/condabin:/tmp/anaconda3/condabin:/tmp/miniconda2/condabin:/tmp/anaconda2/condabin:/tmp/mambaforge/condabin"
1694525797638,'"+ export AWS_SAGEMAKER_PYTHONNOUSERSITE=0"
1694525797638,'"+ PYTHONNOUSERSITE=1 /opt/.sagemakerinternal/conda/bin/jupyter-kernelgateway --ip 0.0.0.0 --port 8888 --JupyterWebsocketPersonality.list_kernels=True --KernelSpecManager.ensure_native_kernel=False --MultiKernelManager.default_kernel_name= --KernelGatewayApp.kernel_spec_manager_class=nb_conda_kernels.CondaKernelSpecManager --CondaKernelSpecManager.env_filter=/opt/conda$"
1694525802125,Using system included Python3 kernel.
1694525802125,"[KernelGatewayApp] [nb_conda_kernels] enabled, 2 kernels found"
1694525806638,[KernelGatewayApp] Jupyter Kernel Gateway at http://0.0.0.0:8888
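To check whether the log simply stops before the failure is reported, the timestamps can be compared directly. This is a minimal sketch, assuming the log timestamps above are epoch milliseconds and the TimeStamp in the error is epoch seconds. The helper names `to_utc` and `gap_to_error` are hypothetical, and only the three distinct log timestamps are listed:

```python
from datetime import datetime, timezone

# Values copied from the error message and the log export above.
ERROR_TS_SECONDS = 1694525815.2390692   # "TimeStamp" in the error context
LOG_TS_MILLIS = [1694525797638, 1694525802125, 1694525806638]


def to_utc(epoch_seconds):
    """Convert epoch seconds to a timezone-aware UTC datetime."""
    return datetime.fromtimestamp(epoch_seconds, tz=timezone.utc)


def gap_to_error(log_ts_millis, error_ts_seconds):
    """Seconds between the last log line and the reported error."""
    last_log_seconds = max(log_ts_millis) / 1000.0
    return error_ts_seconds - last_log_seconds


if __name__ == "__main__":
    for ts in LOG_TS_MILLIS:
        print(to_utc(ts / 1000.0).isoformat(), "log line")
    print(to_utc(ERROR_TS_SECONDS).isoformat(), "error reported")
    print("gap: %.1f s" % gap_to_error(LOG_TS_MILLIS, ERROR_TS_SECONDS))
```

By this calculation the last log line (the Kernel Gateway reporting that it is listening on port 8888) is written about 8.6 seconds before the error timestamp, and nothing is logged in between.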
Also, if I start an instance with a default kernel and then switch to my custom image, it loads without errors. The error seems to happen only when a new instance is started with the custom image.
Neither the image nor the config has been updated. The error started happening out of nowhere.
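One more place a reason might surface is the app's own status record. This is a minimal sketch, assuming boto3 is installed and credentials are configured. The SageMaker DescribeApp call returns a `Status` and, for a failed app, may include a `FailureReason`. The helper name `app_failure_reason` and the placeholder identifiers in the usage comment are hypothetical:

```python
def app_failure_reason(sagemaker_client, domain_id, user_profile, app_name,
                       app_type="KernelGateway"):
    """Return (status, failure_reason) for a Studio app via DescribeApp.

    `sagemaker_client` is expected to be boto3.client("sagemaker").
    `failure_reason` is None when the response has no FailureReason field.
    """
    response = sagemaker_client.describe_app(
        DomainId=domain_id,
        UserProfileName=user_profile,
        AppType=app_type,
        AppName=app_name,
    )
    return response.get("Status"), response.get("FailureReason")


# Usage (placeholders are hypothetical; substitute real identifiers):
#   import boto3
#   client = boto3.client("sagemaker", region_name="us-east-1")
#   status, reason = app_failure_reason(
#       client, "d-xxxxxxxxxxxx", "my-user-profile", "my-app-name")
#   print(status, reason)
```

The client is passed in as an argument so the helper can be tried against a stub before pointing it at a real account.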