A bit late to this question but:
This is very unusual, because normally the start-up should time out (and the space fail to start) after a few minutes. In CLI/API terms, the storage and space definition is the "space", while the running container instance is an "app", so the list-apps and delete-app APIs are what you'd want for finding the failing-to-start container and sending it an explicit stop signal. You should also be able to find your space in the SageMaker Console under Admin configurations > Domains > {your domain} > Space management, then click through on the name of your space to see (and possibly delete) any "apps" running on it.
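If you'd rather script the list-apps / delete-app step than click through the console, here is a minimal sketch in Python. It assumes you pass in a boto3 SageMaker client; the function names, the dry_run flag and the example IDs are my own placeholders, not anything from your setup. It only uses the `list_apps` and `delete_app` calls, and skips apps that already report a "Deleted" or "Deleting" status.

```python
"""Sketch: find and stop the apps attached to a SageMaker Studio space."""

# App statuses that mean there is nothing left to stop.
FINISHED_STATUSES = {"Deleted", "Deleting"}


def list_space_apps(sm_client, domain_id, space_name):
    """Return every app on the space that has not already been deleted."""
    apps = []
    next_token = None
    while True:
        kwargs = {"DomainIdEquals": domain_id, "SpaceNameEquals": space_name}
        if next_token:
            kwargs["NextToken"] = next_token
        page = sm_client.list_apps(**kwargs)
        apps.extend(
            app for app in page.get("Apps", [])
            if app.get("Status") not in FINISHED_STATUSES
        )
        next_token = page.get("NextToken")
        if not next_token:
            return apps


def delete_space_apps(sm_client, domain_id, space_name, dry_run=True):
    """Delete the space's live apps; with dry_run=True only report them."""
    targets = list_space_apps(sm_client, domain_id, space_name)
    for app in targets:
        if not dry_run:
            sm_client.delete_app(
                DomainId=domain_id,
                SpaceName=space_name,
                AppType=app["AppType"],
                AppName=app["AppName"],
            )
    return [(app["AppType"], app["AppName"], app["Status"]) for app in targets]


# Example (needs boto3 and AWS credentials; the IDs are placeholders):
#   import boto3
#   sm = boto3.client("sagemaker")
#   print(delete_space_apps(sm, "d-exampledomain", "my-space", dry_run=True))
```

Running it with dry_run=True first shows which apps would be stopped before anything is actually deleted.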
If your space won't (re)start even after checking you have no lifecycle configuration scripts enabled on it (visible in space details in SageMaker Studio), then it sounds like something in your storage (i.e. that .condarc file) is fundamentally clashing with the container image itself and preventing it from starting healthily.
You could try upgrading or downgrading the "image" on your space (again, see space details in SageMaker Studio - this drop-down will only be enabled when your space is in "stopped" status) to see if that will resolve the issue.
Unfortunately I'm not aware of any way to mount the space's EBS volume to a different/non-SageMaker instance... So my only other suggestions would be:
- Try setting up a lifecycle configuration script to repair your condarc and attaching that to the space. IF your space makes it far enough through the start-up process to trigger your script, then hopefully that can resolve the issue. LCC logs should appear in the /aws/sagemaker/studio log group in CloudWatch, under a stream named as {domain-ID}/{space-name}/CodeEditor/default/LifecycleConfigOnStart
- Try setting up a custom container image and using that instead of the SageMaker Distribution image for your space. Even if it's not a fully-functional Code Editor image that could get all the way through start-up, maybe it could be one with an ENTRYPOINT that will properly clean up /home/sagemaker-user/.condarc - so that you can switch back to the vanilla image and re-start?
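For the lifecycle configuration route, something like the sketch below might be a starting point. It assumes boto3 SageMaker and CloudWatch Logs clients are passed in. The repair script itself is purely hypothetical: it just moves .condarc aside, and the right fix for your file may well be different. The function names are mine too. Registering the script doesn't attach it to the space; that step is still up to you.

```python
"""Sketch: register a .condarc repair LCC and read its CloudWatch logs."""
import base64

LOG_GROUP = "/aws/sagemaker/studio"

# Hypothetical repair: move the suspect file aside instead of editing it.
REPAIR_SCRIPT = """#!/bin/bash
set -eux
CONDARC=/home/sagemaker-user/.condarc
if [ -f "$CONDARC" ]; then
    mv "$CONDARC" "$CONDARC.bak"
fi
"""


def encode_lcc_script(script):
    """Base64-encode the script, as the lifecycle config API expects."""
    return base64.b64encode(script.encode("utf-8")).decode("ascii")


def register_lcc(sm_client, name, script=REPAIR_SCRIPT):
    """Create a Code Editor lifecycle configuration and return its ARN."""
    response = sm_client.create_studio_lifecycle_config(
        StudioLifecycleConfigName=name,
        StudioLifecycleConfigContent=encode_lcc_script(script),
        StudioLifecycleConfigAppType="CodeEditor",
    )
    return response["StudioLifecycleConfigArn"]


def lcc_log_stream(domain_id, space_name):
    """Build the log stream name in the format described above."""
    return f"{domain_id}/{space_name}/CodeEditor/default/LifecycleConfigOnStart"


def read_lcc_logs(logs_client, domain_id, space_name):
    """Return the messages from the first page of the LCC log stream."""
    response = logs_client.get_log_events(
        logGroupName=LOG_GROUP,
        logStreamName=lcc_log_stream(domain_id, space_name),
        startFromHead=True,
    )
    return [event["message"] for event in response["events"]]
```

If the log stream never appears after a start attempt, that would suggest the space isn't getting far enough through start-up to run the script at all.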
Maybe you already gave up on recovering this space, but if so hope this at least might be useful to somebody in future! 😓
