Have you tried a different instance type or a different AZ?
I have tried ml.g4dn.2xlarge and ml.g4dn.xlarge. In my case, I need to use the g4dn family of machines.
How are you deploying your model? Can you share example code?
I'm just using the console to create the Model, EndpointConfig, and Endpoint right now, so no code to share for those steps.
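For anyone who wants to reproduce this outside the console, here is a rough sketch of the same three steps (Model, EndpointConfig, Endpoint) with boto3. It is not the exact setup used here: the `-config` and `-endpoint` name suffixes, the `AllTraffic` variant name, the single-instance count, and the helper names `build_requests` and `deploy` are all assumptions made for illustration, and the image URI, model artifact location, and role ARN have to come from your own account.

```python
from typing import Any, Dict


def build_requests(name: str, image_uri: str, model_data_url: str,
                   role_arn: str,
                   instance_type: str = "ml.g4dn.xlarge") -> Dict[str, Dict[str, Any]]:
    """Build keyword arguments for the three SageMaker API calls.

    The naming scheme (name, name-config, name-endpoint) is hypothetical.
    """
    config_name = f"{name}-config"
    return {
        # Step 1: the Model ties a container image to model artifacts and a role.
        "model": {
            "ModelName": name,
            "PrimaryContainer": {"Image": image_uri, "ModelDataUrl": model_data_url},
            "ExecutionRoleArn": role_arn,
        },
        # Step 2: the EndpointConfig picks the instance type and count.
        "endpoint_config": {
            "EndpointConfigName": config_name,
            "ProductionVariants": [{
                "VariantName": "AllTraffic",  # assumed variant name
                "ModelName": name,
                "InitialInstanceCount": 1,
                "InstanceType": instance_type,
            }],
        },
        # Step 3: the Endpoint is created from the EndpointConfig.
        "endpoint": {
            "EndpointName": f"{name}-endpoint",
            "EndpointConfigName": config_name,
        },
    }


def deploy(requests: Dict[str, Dict[str, Any]], region: str) -> None:
    """Issue the three calls in order. Requires boto3 and AWS credentials."""
    import boto3  # imported lazily so build_requests works without AWS access

    sm = boto3.client("sagemaker", region_name=region)
    sm.create_model(**requests["model"])
    sm.create_endpoint_config(**requests["endpoint_config"])
    sm.create_endpoint(**requests["endpoint"])
```

Scripting the steps makes it easier to retry with a different instance type or region by changing one argument, and it gives others in the thread something concrete to compare against.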
I spent time analyzing the CloudTrail events, and I can see that SageMaker downloads all of the image layers four times before finally failing. None of the API calls recorded in CloudTrail report errors, so I think a failure after the image layers are downloaded must be triggering a series of retries. I'm stumped as to what that failure might be, since there are no events or logs associated with it.
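Since CloudTrail only records API calls, two other places are worth checking: the `FailureReason` field returned by `DescribeEndpoint`, and the endpoint's CloudWatch Logs group, which SageMaker names `/aws/sagemaker/Endpoints/<endpoint name>`. The following is a minimal sketch of pulling both with boto3. The helper names and the `max_events` limit are illustrative assumptions, and the log group may not exist at all if the container never got far enough to write anything.

```python
from typing import Any, Dict, List


def endpoint_log_group(endpoint_name: str) -> str:
    """CloudWatch Logs group that SageMaker uses for an endpoint's containers."""
    return f"/aws/sagemaker/Endpoints/{endpoint_name}"


def summarize_endpoint(description: Dict[str, Any]) -> str:
    """Reduce a DescribeEndpoint response to its status and failure reason."""
    status = description.get("EndpointStatus", "Unknown")
    reason = description.get("FailureReason", "(no FailureReason reported)")
    return f"{status}: {reason}"


def fetch_diagnostics(endpoint_name: str, region: str,
                      max_events: int = 50) -> List[str]:
    """Collect the failure reason plus any container log lines.

    Requires boto3 and AWS credentials; imported lazily so the helpers
    above can be used without them.
    """
    import boto3

    sm = boto3.client("sagemaker", region_name=region)
    logs = boto3.client("logs", region_name=region)

    lines = [summarize_endpoint(sm.describe_endpoint(EndpointName=endpoint_name))]

    group = endpoint_log_group(endpoint_name)
    try:
        streams = logs.describe_log_streams(logGroupName=group)["logStreams"]
    except logs.exceptions.ResourceNotFoundException:
        # No log group usually means the container never wrote any output.
        lines.append(f"log group {group} not found")
        return lines

    for stream in streams:
        events = logs.get_log_events(
            logGroupName=group,
            logStreamName=stream["logStreamName"],
            limit=max_events,
        )["events"]
        lines.extend(event["message"] for event in events)
    return lines
```

If the log group turns out to be missing or empty, that is still a useful data point, because it suggests the failure happens before the container starts producing output rather than inside the model server.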