1 Answer
Please refer to the Amazon SageMaker endpoints and quotas page: https://docs.aws.amazon.com/general/latest/gr/sagemaker.html
Per that page, the quota "Maximum number of training jobs each hyperparameter tuning job can run in parallel at once" has a default of 10 in each supported Region and is listed as not adjustable.
Because the limit is not adjustable, it cannot be increased by raising a support case.
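Since the limit cannot be raised, one option is to cap the requested parallelism on the client side before submitting the tuning job. The following is only a minimal sketch: `build_resource_limits` and `MAX_PARALLEL_PER_TUNING_JOB` are hypothetical names introduced here for illustration, and the value 10 is taken from the quota quoted above, so check it against the quotas page for your account. The returned dictionary uses the `MaxNumberOfTrainingJobs` and `MaxParallelTrainingJobs` field names that go under `ResourceLimits` in the tuning job configuration.

```python
# Default quota quoted above; assumed to still apply to your account and Region.
MAX_PARALLEL_PER_TUNING_JOB = 10


def build_resource_limits(total_jobs: int,
                          requested_parallel: int,
                          quota: int = MAX_PARALLEL_PER_TUNING_JOB) -> dict:
    """Return a ResourceLimits-style dict with parallelism capped at the quota.

    Hypothetical helper: it only builds the dict and makes no AWS calls.
    """
    if total_jobs < 1 or requested_parallel < 1:
        raise ValueError("total_jobs and requested_parallel must be >= 1")
    # Parallelism cannot exceed the quota or the total number of jobs.
    parallel = min(requested_parallel, quota, total_jobs)
    return {
        "MaxNumberOfTrainingJobs": total_jobs,
        "MaxParallelTrainingJobs": parallel,
    }


limits = build_resource_limits(total_jobs=50, requested_parallel=16)
print(limits)
```

The resulting dictionary could then be passed as the `ResourceLimits` entry of `HyperParameterTuningJobConfig` when creating the tuning job. If more than 10 concurrent training jobs are needed, splitting the search across several tuning jobs may be worth considering, subject to the other quotas on that page.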
Answered a year ago
I get the same error when I run only 3 SM Pipelines at the same time, each with just a single TrainingStep. The training job succeeds, but the SM Pipeline fails. My training job takes around 5-6 hours, so unless this is resolved I cannot rely on SM Pipelines to train the models: re-running the entire training step is very expensive in both compute and time.