Deployed SageMaker Endpoint Fails Health Checks Since New SageMaker Python Library Version

Hi,

I'm having trouble deploying a SageMaker endpoint that previously worked. About a week ago I successfully deployed the Nous Hermes Llama 2 7B model to a g5.2xlarge endpoint, and it responded to inference requests as expected. I then deleted the endpoint and, a week later, tried to deploy it again with exactly the same configuration. The new endpoint fails its health checks.

I followed the same deployment steps as before, including the same instance type and configuration settings. The only change since the successful deployment is an update of the SageMaker Python library to version 2.175, which enabled the huggingface-llm 0.9.3 DLC images instead of 0.8.2. I tried reverting to the previous version, and that did not work either.
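
A minimal sketch of what such a deployment might look like with the container image version pinned explicitly, so the SDK does not pick a newer default. The model ID, the 600-second startup timeout, and the helper names `build_llm_env` and `deploy` are assumptions for illustration and are not taken from the original setup.

```python
def build_llm_env(model_id: str, num_gpus: int = 1) -> dict:
    """Environment variables read by the Hugging Face LLM container."""
    return {
        "HF_MODEL_ID": model_id,       # model to pull from the Hugging Face Hub
        "SM_NUM_GPUS": str(num_gpus),  # a g5.2xlarge has a single GPU
    }


def deploy(role_arn: str, image_version: str = "0.8.2"):
    """Deploy with an explicitly pinned huggingface-llm image version."""
    # Imported lazily so build_llm_env can be used without the SDK installed.
    from sagemaker.huggingface import (
        HuggingFaceModel,
        get_huggingface_llm_image_uri,
    )

    # Resolve the image URI for the pinned version in the session's region.
    image_uri = get_huggingface_llm_image_uri("huggingface", version=image_version)

    model = HuggingFaceModel(
        role=role_arn,
        image_uri=image_uri,
        # Assumed model ID; replace with the one actually deployed.
        env=build_llm_env("NousResearch/Nous-Hermes-llama-2-7b"),
    )

    # Large models can take several minutes to download and load, so allow
    # extra time before SageMaker starts failing health checks (assumed value).
    return model.deploy(
        initial_instance_count=1,
        instance_type="ml.g5.2xlarge",
        container_startup_health_check_timeout=600,
    )
```

Calling `deploy(role_arn, image_version="0.9.3")` instead would show whether the behavior differs between the two image versions under otherwise identical settings.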

Has anyone else encountered a similar issue after a recent SageMaker library update? Are there any new considerations or configurations required for deploying the model? Is there a recommended approach to troubleshoot this issue and identify the cause?
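
One possible starting point for troubleshooting is to read the container logs that SageMaker writes to CloudWatch. The sketch below assumes the standard `/aws/sagemaker/Endpoints/<endpoint-name>` log group and uses boto3; the helper names and the endpoint name in the usage note are hypothetical.

```python
def endpoint_log_group(endpoint_name: str) -> str:
    """CloudWatch log group that SageMaker uses for an endpoint's containers."""
    return f"/aws/sagemaker/Endpoints/{endpoint_name}"


def tail_endpoint_logs(endpoint_name: str, max_events: int = 50) -> list:
    """Return the latest log messages from the most recently active stream."""
    # Imported lazily so endpoint_log_group works without boto3 installed.
    import boto3

    logs = boto3.client("logs")
    group = endpoint_log_group(endpoint_name)

    # Pick the log stream with the most recent event.
    # Raises ResourceNotFoundException if the log group does not exist yet.
    streams = logs.describe_log_streams(
        logGroupName=group,
        orderBy="LastEventTime",
        descending=True,
        limit=1,
    )["logStreams"]
    if not streams:
        return []

    # Fetch the newest events from that stream.
    events = logs.get_log_events(
        logGroupName=group,
        logStreamName=streams[0]["logStreamName"],
        limit=max_events,
        startFromHead=False,
    )["events"]
    return [event["message"] for event in events]
```

For example, `print("\n".join(tail_endpoint_logs("my-llm-endpoint")))` would print the most recent container output, which is where a model-loading error or out-of-memory message would be expected to appear if the container is failing before it can answer health checks.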

I would greatly appreciate any advice, suggestions, or guidance you can provide. Thank you!

Aron
Asked 9 months ago · 88 views
No answers
