Hi,
I'm encountering an issue deploying a SageMaker endpoint that previously worked fine. About a week ago I successfully deployed the Nous Hermes Llama 2 7B model to an ml.g5.2xlarge endpoint, and it responded to inference requests exactly as expected. I then deleted the endpoint, and when I attempted to redeploy it this week using the exact same configuration, the endpoint fails its health checks.
I followed the same deployment steps as before, including the same instance type and configuration settings. The only change since the successful deployment is an update of the SageMaker Python SDK to version 2.175, which selects the huggingface-llm 0.9.3 DLC image instead of 0.8.2. I tried reverting to the previous SDK version, but that did not help either.
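For reference, here is roughly what my deployment code looks like (the model ID, token limits, and timeout value below are illustrative, not necessarily my exact settings; the health-check timeout is one knob I have been experimenting with):

```python
# Container environment for the TGI-based huggingface-llm DLC.
# Values here are illustrative placeholders.
env = {
    "HF_MODEL_ID": "NousResearch/Nous-Hermes-llama-2-7b",
    "SM_NUM_GPUS": "1",          # g5.2xlarge has a single A10G GPU
    "MAX_INPUT_LENGTH": "2048",
    "MAX_TOTAL_TOKENS": "4096",
}

def deploy(role, llm_version="0.9.3"):
    """Deploy the model; requires AWS credentials and a SageMaker execution role."""
    # SDK imports are inside the function so the env config above can be
    # inspected without the SageMaker SDK installed.
    from sagemaker.huggingface import (
        HuggingFaceModel,
        get_huggingface_llm_image_uri,
    )
    image_uri = get_huggingface_llm_image_uri("huggingface", version=llm_version)
    model = HuggingFaceModel(image_uri=image_uri, env=env, role=role)
    return model.deploy(
        initial_instance_count=1,
        instance_type="ml.g5.2xlarge",
        # Give the container longer to download weights before health checks fail.
        container_startup_health_check_timeout=600,
    )
```

Switching `llm_version` between "0.9.3" and "0.8.2" is how I tried the two DLC images.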
Has anyone else encountered a similar issue after a recent SageMaker library update?
Are there any new considerations or configurations required for deploying the model?
Is there a recommended approach to troubleshoot this issue and identify the cause?
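In case it helps, this is roughly how I have been pulling the container logs to look for the failing health check (the endpoint name is a placeholder; the log group naming follows SageMaker's `/aws/sagemaker/Endpoints/<name>` convention):

```python
def log_group_for(endpoint_name: str) -> str:
    """SageMaker writes endpoint container logs under this CloudWatch log group."""
    return f"/aws/sagemaker/Endpoints/{endpoint_name}"

def dump_endpoint_logs(endpoint_name: str, limit: int = 50):
    """Print recent container log events; requires AWS credentials."""
    import boto3  # imported here so log_group_for() is usable without boto3
    logs = boto3.client("logs")
    group = log_group_for(endpoint_name)
    streams = logs.describe_log_streams(
        logGroupName=group, orderBy="LastEventTime", descending=True
    )["logStreams"]
    for stream in streams:
        events = logs.get_log_events(
            logGroupName=group,
            logStreamName=stream["logStreamName"],
            limit=limit,
        )["events"]
        for event in events:
            print(event["message"])
```

So far the logs have not made the root cause obvious to me, which is why I'm asking here.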
I would greatly appreciate any advice, suggestions, or guidance you can provide. Thank you!