SageMaker Autopilot Endpoint OutOfMemory


Good morning. I am running into an unusual issue with SageMaker Autopilot. Training on my dataset completes without any problems, the best model is deployed, and the endpoint appears to work correctly. The problem starts when I call the endpoint repeatedly: memory usage climbs rapidly until it reaches 97%. At that point the endpoint either crashes with the error message "worker died", with the CloudWatch logs showing a Java "OutOfMemory" error, or it stays up but becomes congested.

Nothing else in CloudWatch looks abnormal except for this recurring message: "The column 'column_name' does not exist in your dataset. Please specify a different column name for the 'Input column' that actually exists in your dataset and try again." The message repeats even before I start invoking the endpoint, so I am not sure whether it is actually the cause of the problem.
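For anyone trying to reproduce the behaviour, the repeated calls could be sketched roughly as below. This is only a minimal sketch under assumptions: `probe_endpoint` and `make_sagemaker_invoker` are hypothetical helper names, the endpoint name and payload are placeholders, and `text/csv` is assumed as the content type. It records latency and failures per call so they can be lined up against the memory graph in CloudWatch.

```python
import time
from typing import Callable, Iterable, List, Tuple


def probe_endpoint(
    invoke: Callable[[str], object],
    payloads: Iterable[str],
    pause_seconds: float = 0.0,
) -> List[Tuple[int, float, str]]:
    """Call `invoke` once per payload and record (index, latency_seconds, status)."""
    results = []
    for index, payload in enumerate(payloads):
        start = time.perf_counter()
        try:
            invoke(payload)
            status = "ok"
        except Exception as exc:  # keep going so later failures are recorded too
            status = f"error: {exc}"
        results.append((index, time.perf_counter() - start, status))
        if pause_seconds:
            time.sleep(pause_seconds)
    return results


def make_sagemaker_invoker(endpoint_name: str) -> Callable[[str], bytes]:
    """Build an invoke function for a real endpoint (hypothetical helper)."""
    import boto3  # imported lazily so probe_endpoint can be used without AWS access

    client = boto3.client("sagemaker-runtime")

    def invoke(payload: str) -> bytes:
        response = client.invoke_endpoint(
            EndpointName=endpoint_name,
            ContentType="text/csv",  # assumption: the endpoint accepts CSV rows
            Body=payload,
        )
        return response["Body"].read()

    return invoke
```

A run against a real endpoint might look like `probe_endpoint(make_sagemaker_invoker("my-autopilot-endpoint"), rows, pause_seconds=0.5)`, where the endpoint name and `rows` are placeholders. Comparing the index of the first failed call with the memory curve may help show whether the growth depends on the number of requests or on specific payloads.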

No answers
