http.client.RemoteDisconnected at the end of a SageMaker training job

Hello. I'm running SageMaker training jobs through a library called ZenML. The library is only an abstraction layer: whatever my step returns is automatically saved to S3 as an artifact. The library itself works fine, but when the artifacts are larger files, the upload from SageMaker to S3 fails.

In particular, after the long training run has finished and I have been charged the full price, I get:

ConnectionError: ('Connection aborted.', RemoteDisconnected('Remote end closed connection without response'))

If needed I can provide the full log, but the failure occurs in requests/adapters.py.

From what I found online, this means either that the server actively closed the connection or that there was a network error, even though the artifacts never leave AWS: they only move from SageMaker to S3.
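To check my understanding of what the message itself means, here is a minimal sketch that reproduces the same underlying exception using only the Python standard library. It is not SageMaker, ZenML, or S3 code: the local server, the `/artifact` path, and both function names are made up for illustration. The server accepts a request and then closes the socket without sending any response, which is the situation `http.client.RemoteDisconnected` reports (and which `requests` then wraps in its `ConnectionError`).

```python
import http.client
import socket
import threading


def serve_once_and_hang_up(server_sock):
    """Accept one connection, read the request, then close without replying."""
    conn, _ = server_sock.accept()
    conn.recv(65536)  # consume the request bytes
    conn.close()      # no status line is ever sent back to the client


def reproduce_remote_disconnected():
    """Return the exception message the client sees, or None if no error occurred."""
    # Hypothetical local stand-in for the remote endpoint, bound to a free port.
    server_sock = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    server_sock.bind(("127.0.0.1", 0))
    server_sock.listen(1)
    port = server_sock.getsockname()[1]

    worker = threading.Thread(
        target=serve_once_and_hang_up, args=(server_sock,), daemon=True
    )
    worker.start()

    client = http.client.HTTPConnection("127.0.0.1", port, timeout=5)
    try:
        client.request("GET", "/artifact")
        client.getresponse()  # the server hangs up here, before any response
    except http.client.RemoteDisconnected as exc:
        return str(exc)
    finally:
        client.close()
        worker.join(timeout=5)
        server_sock.close()
    return None


if __name__ == "__main__":
    print(reproduce_remote_disconnected())
```

So the error only says that the other end closed the connection before answering; it does not say why. In my case I still don't know what makes the remote side hang up during the upload.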
