http.client.RemoteDisconnected at the end of a SageMaker training job


Hello. I'm running SageMaker training jobs through a library called ZenML. The library is only an abstraction layer: when my code returns, the artifacts are automatically saved to S3. The library itself works fine, but with larger files the upload from SageMaker to S3 fails.

In particular, after the long training run is done and I have been charged the full price, I get:

ConnectionError: ('Connection aborted.', RemoteDisconnected('Remote end closed connection without response'))

I can provide the full log if needed, but it fails at requests/adapters.py.

From what I found online, this error means either that the server actively closed the connection or that there was a network error, even though the artifacts never leave AWS: they only move from SageMaker to S3.
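For context, a transient disconnect of this kind is commonly handled by retrying the failed call with exponential backoff. The following is only a minimal sketch of that general pattern, not ZenML's or SageMaker's actual upload code: `with_retries` and `upload_fn` are hypothetical names, and it assumes the upload can be wrapped in a callable, which the library may not expose.

```python
import time


def with_retries(upload_fn, max_attempts=5, base_delay=1.0,
                 retry_on=(ConnectionError,)):
    """Call upload_fn, retrying on connection errors with exponential backoff.

    upload_fn is a hypothetical zero-argument callable that performs the upload.
    http.client.RemoteDisconnected is a subclass of the built-in ConnectionError,
    so it is covered by the default. requests.exceptions.ConnectionError is a
    separate class and would have to be passed in retry_on explicitly.
    """
    for attempt in range(1, max_attempts + 1):
        try:
            return upload_fn()
        except retry_on:
            # Out of attempts: re-raise the last error unchanged.
            if attempt == max_attempts:
                raise
            # Wait base_delay, 2 * base_delay, 4 * base_delay, ... between attempts.
            time.sleep(base_delay * 2 ** (attempt - 1))
```

A caller would pass the real upload as a callable, for example `with_retries(lambda: do_upload(path))`, where `do_upload` stands in for whatever performs the transfer. This only helps if the disconnect is transient; it does not explain why the server closed the connection.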

