1 Answer
Please refer to the link below related to the Amazon SageMaker endpoints and quotas: https://docs.aws.amazon.com/general/latest/gr/sagemaker.html
Per the link, the quota "Maximum number of training jobs each hyperparameter tuning job can run in parallel at once" is listed with a default of 10 in each supported Region, and Adjustable: No.
Because the limit is not adjustable, it cannot be increased by raising a support case.
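Since the quota is fixed, one option is to cap the requested parallelism on the client side before submitting the tuning job. The following is only a minimal sketch, assuming the limit of 10 quoted above still applies. `build_resource_limits` and `MAX_PARALLEL_PER_TUNING_JOB` are hypothetical names introduced here for illustration. The returned dict is meant to follow the `ResourceLimits` shape inside `HyperParameterTuningJobConfig` for boto3's `create_hyper_parameter_tuning_job`.

```python
# Assumption: quota value taken from the endpoints-and-quotas page linked above.
MAX_PARALLEL_PER_TUNING_JOB = 10


def build_resource_limits(total_jobs: int, requested_parallel: int) -> dict:
    """Return a ResourceLimits-style dict with parallelism capped at the quota.

    Hypothetical helper: it only builds the dict and does not call AWS.
    """
    if total_jobs < 1 or requested_parallel < 1:
        raise ValueError("total_jobs and requested_parallel must be >= 1")
    # Parallelism cannot exceed the quota or the total number of jobs.
    parallel = min(requested_parallel, MAX_PARALLEL_PER_TUNING_JOB, total_jobs)
    return {
        "MaxNumberOfTrainingJobs": total_jobs,
        "MaxParallelTrainingJobs": parallel,
    }


limits = build_resource_limits(total_jobs=50, requested_parallel=16)
print(limits)
```

The resulting dict could then be passed as the `ResourceLimits` entry of the tuning job configuration. A request for 16 parallel jobs is reduced to 10 here rather than being rejected by the service.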
answered a year ago
I get the same error when I try to run only 3 SM Pipelines at the same time, with just a single TrainingStep in each Pipeline. The training job succeeds, but the SM Pipeline fails. My training job takes around 5-6 hours, so unless this is resolved, I cannot rely on SM Pipelines to train the models, because re-running the entire training step is very expensive in both compute and time.