Hello, thank you for reaching out. It is difficult to identify the issue without a deeper look at the job logs to understand why the job failed with 'InternalServerError'.
Typically, to extend a pre-built container in SageMaker, you need to declare the SAGEMAKER_SUBMIT_DIRECTORY and SAGEMAKER_PROGRAM environment variables. Please refer to the example Dockerfile here: https://github.com/aws/amazon-sagemaker-examples/blob/0efd885ef2a5c04929d10c5272681f4ca17dac17/advanced_functionality/pytorch_extend_container_train_deploy_bertopic/container/Dockerfile
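As a rough illustration of the two environment variables, here is a minimal sketch that renders a Dockerfile extending a pre-built image. The helper name `render_dockerfile`, the entry-point name `train.py`, and the code directory `/opt/ml/code` are assumptions for illustration, not taken from your job; substitute the base image URI and script you actually use.

```python
def render_dockerfile(base_image: str,
                      program: str = "train.py",
                      code_dir: str = "/opt/ml/code") -> str:
    """Return Dockerfile text that extends `base_image`.

    Assumptions (hypothetical, adjust to your setup):
    - `program` is the training entry-point script in the build context.
    - `code_dir` is where the script is copied inside the image.
    """
    lines = [
        f"FROM {base_image}",
        # Copy the training script into the image.
        f"COPY {program} {code_dir}/{program}",
        # Directory inside the container that holds the user code.
        f"ENV SAGEMAKER_SUBMIT_DIRECTORY {code_dir}",
        # Script inside that directory to run as the entry point.
        f"ENV SAGEMAKER_PROGRAM {program}",
    ]
    return "\n".join(lines) + "\n"


if __name__ == "__main__":
    # "<prebuilt-image-uri>" is a placeholder, not a real image URI.
    print(render_dockerfile("<prebuilt-image-uri>"))
```

Compare the rendered output with the linked example Dockerfile before building, since the base image you extend may expect a different code location.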
You can also test the container image in local mode to check that it is working as expected before deploying it to a job. If the issue persists, I recommend opening an AWS support case and including the job ARN and associated logs for further troubleshooting: https://console.aws.amazon.com/support/home?#/case/create
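For the local-mode check mentioned above, a minimal sketch with the SageMaker Python SDK might look like the following. It assumes Docker is running and the SDK is installed with local-mode support; the helper names, the `training` channel name, and all argument values are hypothetical placeholders.

```python
from typing import Any, Dict


def local_estimator_kwargs(image_uri: str, role: str) -> Dict[str, Any]:
    """Build Estimator arguments that run the container on this machine."""
    return {
        "image_uri": image_uri,    # the extended image you built and tagged
        "role": role,              # IAM role ARN (placeholder in this sketch)
        "instance_count": 1,
        "instance_type": "local",  # "local" selects SageMaker local mode
    }


def run_local_training(image_uri: str, role: str, train_dir: str):
    """Start a local-mode training run against data in `train_dir`."""
    # Imported lazily so the helper above can be used without the SDK installed.
    from sagemaker.estimator import Estimator

    estimator = Estimator(**local_estimator_kwargs(image_uri, role))
    # The channel name "training" is an assumption; use the one your script reads.
    estimator.fit({"training": f"file://{train_dir}"})
    return estimator
```

If the container fails locally, the Docker output usually shows the underlying error directly, which is easier to read than a generic failure from a hosted job.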