InternalServerError


We created a time-series model with Autopilot on SageMaker. When we start a batch transform job with this model, the job stays pending, and after about 20 minutes it is marked as failed with "InternalServerError: We encountered an internal error. Please try again.". Retrying the job did not help. Is there any way to debug this? The model's execution role, container, security group, and subnets are set correctly. The containers are all in the same repository, which the execution role has permission to access.

George
asked 7 months ago, 234 views
1 Answer

Hi,

this guidance on troubleshooting such cases may help you find (and solve) your issue: https://repost.aws/knowledge-center/sagemaker-http-500-internal-server-error

Best,

Didier

AWS EXPERT
answered 7 months ago
  • Hi Didier, thank you for the suggestion. I tried this, but it did not work for me.

  • Hi, I would then suggest two additional things: check the CloudWatch logs to see whether anything unusual shows up, and check in CloudTrail which API calls are made and which of them fail.
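
The two checks suggested above could be scripted roughly as follows. This is only a sketch under a few assumptions: it uses boto3, it assumes the job's logs are in the default batch transform log group `/aws/sagemaker/TransformJobs` with stream names that start with the job name, and it reads only the first page of each API response. The helper names (`transform_job_failure`, `transform_job_log_messages`, `failed_sagemaker_calls`, `main`) are made up for illustration and are not part of any AWS library.

```python
import json


def transform_job_failure(sm_client, job_name):
    """Return (status, failure_reason) as reported by SageMaker for the job."""
    job = sm_client.describe_transform_job(TransformJobName=job_name)
    return job["TransformJobStatus"], job.get("FailureReason")


def transform_job_log_messages(logs_client, job_name,
                               log_group="/aws/sagemaker/TransformJobs"):
    """Collect log messages from log streams whose name starts with the job name.

    Assumption: the default log group is used. Only the first page of
    streams and events is read, which is usually enough for a first look.
    """
    streams = logs_client.describe_log_streams(
        logGroupName=log_group, logStreamNamePrefix=job_name
    )["logStreams"]
    messages = []
    for stream in streams:
        events = logs_client.get_log_events(
            logGroupName=log_group,
            logStreamName=stream["logStreamName"],
            startFromHead=True,
        )["events"]
        messages.extend(event["message"] for event in events)
    return messages


def failed_sagemaker_calls(cloudtrail_client, start_time, end_time):
    """Return (event name, error code, error message) for SageMaker API
    calls in the time window whose CloudTrail record contains an error."""
    response = cloudtrail_client.lookup_events(
        LookupAttributes=[{"AttributeKey": "EventSource",
                           "AttributeValue": "sagemaker.amazonaws.com"}],
        StartTime=start_time,
        EndTime=end_time,
    )
    failures = []
    for event in response["Events"]:
        # CloudTrailEvent is a JSON string; failed calls carry an errorCode.
        detail = json.loads(event["CloudTrailEvent"])
        if "errorCode" in detail:
            failures.append(
                (event["EventName"], detail["errorCode"], detail.get("errorMessage"))
            )
    return failures


def main(job_name):
    """Run all three checks against the account's default region and credentials."""
    from datetime import datetime, timedelta, timezone
    import boto3  # imported here so the helpers above stay testable without AWS

    print(transform_job_failure(boto3.client("sagemaker"), job_name))
    for message in transform_job_log_messages(boto3.client("logs"), job_name):
        print(message)
    end = datetime.now(timezone.utc)
    start = end - timedelta(hours=2)  # assumed window; widen it if the job ran earlier
    for failure in failed_sagemaker_calls(boto3.client("cloudtrail"), start, end):
        print(failure)
```

Calling `main("your-transform-job-name")` with the real job name prints the reported failure reason, any container log lines, and any SageMaker API calls that recorded an error. If no log messages come back at all, one possible reading is that the job failed before the container started writing logs, which would point toward setup (networking, image pull, permissions) rather than the model code.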
