
How to debug an invocation timeout in SageMaker?


I am testing inference in SageMaker using one of the containers listed here. The model is zipped up as shown below, and in the inference script I override functions such as model_fn and predict_fn. I tested this with batch transform: it worked for a few small input files, but for larger files I keep getting "Model server did not respond to /invocations request within 3600 seconds". What could be causing this? 3600 is the maximum we can set for the "invocation timeout in seconds" parameter, and the default maximum input size for batch transform is 6 MB. The input files I'm using are much smaller than that, yet I still get the error.

Directory structure

|- model.pth
|- code/
  |- requirements.txt  

Inference script:

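import torch
import os

def model_fn(model_dir):
    # Your_Model is a placeholder for the actual model class.
    model = Your_Model()
    with open(os.path.join(model_dir, 'model.pth'), 'rb') as f:
        # Load the saved weights (assumes model.pth holds a state_dict).
        model.load_state_dict(torch.load(f))
    return model

def predict_fn(input_data, model):
    # Body omitted in the original post; a typical forward pass is assumed here.
    model.eval()
    with torch.no_grad():
        return model(input_data)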

Based on the docs here, do we need to install Flask and expose an /invocations endpoint that responds with 200 OK when we are using a custom container?
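For context on the /invocations question: a fully custom (bring-your-own) container is expected to answer GET /ping and POST /invocations on port 8080, while the prebuilt framework containers already bundle a model server that does this, so Flask should not be needed when only model_fn and predict_fn are overridden. The sketch below illustrates that contract with the Python standard library only. The Handler class, the call helper, the echo-style response, and the random local port are all illustrative assumptions, not the behaviour of any SageMaker container.

```python
import http.client
import json
import threading
from http.server import BaseHTTPRequestHandler, HTTPServer

class Handler(BaseHTTPRequestHandler):
    def _send(self, status, body=b""):
        self.send_response(status)
        self.send_header("Content-Type", "application/json")
        self.send_header("Content-Length", str(len(body)))
        self.end_headers()
        self.wfile.write(body)

    def do_GET(self):
        # Health check: GET /ping should return HTTP 200 when the server is ready.
        self._send(200 if self.path == "/ping" else 404)

    def do_POST(self):
        if self.path != "/invocations":
            self._send(404)
            return
        length = int(self.headers.get("Content-Length", 0))
        payload = json.loads(self.rfile.read(length))
        # Placeholder "inference": report how many items were received.
        self._send(200, json.dumps({"count": len(payload)}).encode())

    def log_message(self, *args):
        pass  # keep the example quiet

# Port 0 picks a free local port for this demo; a real container listens on 8080.
server = HTTPServer(("127.0.0.1", 0), Handler)
threading.Thread(target=server.serve_forever, daemon=True).start()

def call(method, path, body=None):
    """Hypothetical helper: send one request and return (status, body bytes)."""
    conn = http.client.HTTPConnection("127.0.0.1", server.server_port, timeout=5)
    conn.request(method, path, body=body)
    response = conn.getresponse()
    data = response.read()
    conn.close()
    return response.status, data

ping_status, _ = call("GET", "/ping")
invoke_status, invoke_body = call("POST", "/invocations", body=b"[1, 2, 3]")
response_body = json.loads(invoke_body)

server.shutdown()
server.server_close()
```

If a handler like this takes longer than the configured invocation timeout to return, the caller sees the same kind of "did not respond to /invocations" error, regardless of how small the request payload is.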

1 Answer

One of the best ways to debug a custom inference script is to start with SageMaker "local mode". Once you are sure that your script works correctly, move on to hosting it on a SageMaker endpoint. Here are some examples to get started.
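Even before starting local mode, it can help to call the handler functions directly in plain Python and time each stage, since a slow model load or a slow predict_fn on larger inputs could both end up as an invocation timeout. The following is only a sketch: model_fn and predict_fn here are trivial stand-ins for the real handlers in the inference script, the model directory path is just a placeholder argument, and timed is a hypothetical helper, not part of the SageMaker SDK.

```python
import time

def model_fn(model_dir):
    # Hypothetical stand-in for the real model_fn; returns a trivial "model".
    return lambda batch: [x * 2 for x in batch]

def predict_fn(input_data, model):
    # Hypothetical stand-in for the real predict_fn.
    return model(input_data)

def timed(label, fn, *args):
    """Call fn(*args), print how long it took, and return (result, seconds)."""
    start = time.perf_counter()
    result = fn(*args)
    elapsed = time.perf_counter() - start
    print(f"{label}: {elapsed:.3f}s")
    return result, elapsed

# Time model loading and a single prediction separately.
model, load_seconds = timed("model_fn", model_fn, "/opt/ml/model")
prediction, predict_seconds = timed("predict_fn", predict_fn, [1, 2, 3], model)
```

Running the same measurement with one of the larger input files that fails in batch transform should show whether the time is spent in loading, preprocessing, or the forward pass.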

For example, for a TensorFlow Serving model with a custom inference script, I would use local mode for testing as shown below:

from sagemaker.tensorflow.model import TensorFlowModel
from sagemaker.local import LocalSession

tensorflow_serving_model = TensorFlowModel(
  # sagemaker_session=sagemaker_session,
answered 5 months ago
