How to get logs or print statements from SageMaker PyTorch deployed endpoint?


I've deployed an extended PyTorch model as an endpoint and I'm trying to make inference requests to it. The problem is that requests to the endpoint time out, and the CloudWatch logs show nothing beyond:

1661544743589 WARNING: sun.reflect.Reflection.getCallerClass is not supported. This will impact performance.
1661544749569 Model server started.

In the file I provided as the entry point, I've set up logging as follows:

import logging
import sys

logger = logging.getLogger(__name__)
logger.addHandler(logging.StreamHandler(sys.stdout))
logger.info("Loading file.")
print("Loading file.")

I wish to see those logs/prints. How can I accomplish that?

2 Answers

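I've come across similar issues in the past, where log messages did not make it through to CloudWatch, and can suggest setting the environment variable PYTHONUNBUFFERED=1 (discussed further here on StackOverflow with respect to containerized Python in general). When stdout is redirected to a pipe rather than a terminal, Python buffers output by default, so messages can sit in the buffer instead of reaching the log stream.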
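To see what the variable changes, here is a small local sketch (nothing SageMaker-specific). It starts a child Python interpreter with stdout redirected to a pipe, much as it would be inside a container, and reports whether the child's sys.stdout passes writes through immediately. The helper name child_stdout_write_through is made up for illustration, and the write_through attribute assumes Python 3.7 or later.

```python
import os
import subprocess
import sys


def child_stdout_write_through(unbuffered: bool) -> bool:
    """Start a child interpreter with piped stdout and report whether
    its sys.stdout passes writes straight through (i.e. is unbuffered)."""
    env = dict(os.environ)
    env.pop("PYTHONUNBUFFERED", None)  # start from a known state
    if unbuffered:
        env["PYTHONUNBUFFERED"] = "1"
    result = subprocess.run(
        # write_through is an attribute of TextIOWrapper (Python 3.7+).
        [sys.executable, "-c", "import sys; print(sys.stdout.write_through)"],
        env=env,
        capture_output=True,  # stdout becomes a pipe, not a terminal
        text=True,
        check=True,
    )
    return result.stdout.strip() == "True"


if __name__ == "__main__":
    print("default:", child_stdout_write_through(False))
    print("PYTHONUNBUFFERED=1:", child_stdout_write_through(True))
```

Calling print(..., flush=True) in the script is an alternative for individual statements, but the environment variable covers every write without code changes.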

The procedure for this may vary a little depending on how you're creating your model, endpoint config and endpoint (e.g. direct boto3/API calls, SageMaker SDK Estimator.deploy() or PyTorchModel). PyTorchModel, for example, should accept an env={"PYTHONUNBUFFERED": "1"} constructor argument.
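As a rough sketch of the PyTorchModel route, assuming the SageMaker Python SDK is installed: the function name build_pytorch_model, the default framework_version and py_version values, and all argument values are placeholders of mine, not something taken from the question.

```python
# Environment variables passed to the inference container.
UNBUFFERED_ENV = {"PYTHONUNBUFFERED": "1"}


def build_pytorch_model(model_data, role, entry_point,
                        framework_version="1.12", py_version="py38"):
    """Return a PyTorchModel whose container runs Python unbuffered.

    All argument values are placeholders to be supplied by the caller;
    the version defaults are assumptions, adjust them to your setup.
    """
    # Imported lazily so this file can be imported without the SDK.
    from sagemaker.pytorch import PyTorchModel

    return PyTorchModel(
        model_data=model_data,      # S3 URI of the model.tar.gz archive
        role=role,                  # IAM execution role ARN
        entry_point=entry_point,    # your inference script
        framework_version=framework_version,
        py_version=py_version,
        env=UNBUFFERED_ENV,         # forwarded to the container environment
    )
```

You would then call something like model.deploy(initial_instance_count=1, instance_type="ml.m5.large") on the returned object; the instance type here is only an example.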

If you are using the SageMaker Python SDK, do watch out that some methods (especially shortcuts like Estimator.deploy()) may re-use existing models & endpoint configs rather than re-creating each time they're run. Check you see the environment variable set in the SageMaker > Inference > Models > {Your Model Name} details page in AWS Console, and run DeleteModel first if needed to force an update!
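The same check can be done from code instead of the console. This is a minimal sketch assuming boto3 is installed and credentials are configured; the helper names are hypothetical, and it only relies on the DescribeModel and DeleteModel API calls.

```python
def model_environment(description: dict) -> dict:
    """Merge the Environment maps found in a DescribeModel response."""
    containers = list(description.get("Containers") or [])
    primary = description.get("PrimaryContainer")
    if primary:
        containers.append(primary)
    env = {}
    for container in containers:
        env.update(container.get("Environment") or {})
    return env


def is_unbuffered(description: dict) -> bool:
    """True if the model's containers set PYTHONUNBUFFERED=1."""
    return model_environment(description).get("PYTHONUNBUFFERED") == "1"


def check_model(model_name: str, delete_if_missing: bool = False) -> bool:
    """Look up the model and optionally delete it if the variable is absent."""
    import boto3  # imported lazily; requires AWS credentials at call time

    sm = boto3.client("sagemaker")
    description = sm.describe_model(ModelName=model_name)
    ok = is_unbuffered(description)
    if not ok and delete_if_missing:
        # Forces the next deploy to re-create the model with the new env.
        sm.delete_model(ModelName=model_name)
    return ok
```

Keeping the dictionary inspection separate from the API call makes it easy to try the logic on a saved response before pointing it at a real account.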

answered 2 years ago


You can view the logs in CloudWatch Logs. Once the endpoint is InService, a log stream is generated for it. In the SageMaker console, go to Inference > Endpoints, click the endpoint name, and locate "View logs". This takes you to the CloudWatch Logs console. There, click on Log groups and locate /aws/sagemaker/Endpoints/pytorch-inference-YYYY-MM-DD-HH-MM-SS-sss > AllTraffic/i-instanceId. For example, taking the code snippet you shared, suppose you add the following logging lines to the script:

import logging
import sys

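logger = logging.getLogger(__name__)
# Without an explicit level, the logger inherits WARNING from an
# unconfigured root logger and INFO records are dropped.
logger.setLevel(logging.INFO)
logger.addHandler(logging.StreamHandler(sys.stdout))
logger.info("Loading file.")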
print("Loading file. --> from print statement")

# rest of the inference script from here

The output will show up under AllTraffic/instance-id (once the endpoint is InService) as follows:

2022-08-30 15:48:42,183 [INFO ] W-9000-model-stdout - Loading file.
2022-08-30 15:48:42,936 [INFO ] W-9000-model-stdout - Loading file. --> from print statement

As the output shows, print statements appear as INFO-level log entries.
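If you prefer to read that log stream without the console, here is a minimal sketch using the boto3 CloudWatch Logs client. It assumes boto3 is installed and credentials are configured; the helper names are mine, and the log group naming follows the /aws/sagemaker/Endpoints/ pattern mentioned earlier.

```python
def endpoint_log_group(endpoint_name: str) -> str:
    """Log group that SageMaker uses for a given endpoint name."""
    return f"/aws/sagemaker/Endpoints/{endpoint_name}"


def print_recent_endpoint_logs(endpoint_name: str, limit: int = 50) -> None:
    """Print the latest events from the most recently active log stream."""
    import boto3  # imported lazily; requires AWS credentials at call time

    logs = boto3.client("logs")
    group = endpoint_log_group(endpoint_name)
    streams = logs.describe_log_streams(
        logGroupName=group,
        orderBy="LastEventTime",
        descending=True,
        limit=1,
    )["logStreams"]
    for stream in streams:
        events = logs.get_log_events(
            logGroupName=group,
            logStreamName=stream["logStreamName"],  # e.g. AllTraffic/i-...
            limit=limit,
            startFromHead=False,  # newest events rather than oldest
        )["events"]
        for event in events:
            print(event["timestamp"], event["message"])
```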

Please see the link [1] for more details on how the log stream is captured for inference jobs. If you have logging statements within your input_fn or, say, your predict_fn function, those statements will show up whenever a prediction/scoring request is made. I hope the shared information is helpful.
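To illustrate the point about input_fn and predict_fn, here is a stripped-down sketch of handler functions with logging. It deliberately avoids torch so it runs anywhere: the model argument is just a placeholder callable, the JSON handling is an assumption about the request format, and a real script would also define model_fn to load the model.

```python
import json
import logging
import sys

logger = logging.getLogger(__name__)
logger.setLevel(logging.INFO)  # let INFO records through
logger.addHandler(logging.StreamHandler(sys.stdout))

# Runs once, when the model server imports the script.
logger.info("Loading file.")


def input_fn(request_body, request_content_type):
    """Deserialize the request; logged once per inference request."""
    logger.info("input_fn called with content type %s", request_content_type)
    if request_content_type != "application/json":
        raise ValueError(f"Unsupported content type: {request_content_type}")
    return json.loads(request_body)


def predict_fn(input_data, model):
    """Apply the model to each item; logged once per inference request."""
    logger.info("predict_fn called with %d item(s)", len(input_data))
    return [model(item) for item in input_data]
```

The module-level message should appear around endpoint startup, while the two handler messages should only appear once a request actually reaches the script. If requests time out before that, seeing only the startup message is consistent with what was described.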

Reference: [1]

Please also see @Alex_T's answer regarding PYTHONUNBUFFERED=1, which forces Python's stdout and stderr to be unbuffered so that print and logging output is written immediately. If you are bringing your own container, you can set this environment variable in the Dockerfile.

answered 2 years ago
