AWS SageMaker - Extending a Pre-built Container, Endpoint Deployment Failed: "No such file or directory: 'serve'"


I am trying to deploy a SageMaker inference endpoint by extending a pre-built image. However, the deployment fails with "FileNotFoundError: [Errno 2] No such file or directory: 'serve'".

My Dockerfile

ARG REGION=us-west-2

# SageMaker PyTorch image
FROM 763104351884.dkr.ecr.$REGION.amazonaws.com/pytorch-inference:1.12.1-gpu-py38-cu116-ubuntu20.04-ec2

RUN apt-get update

ENV PATH="/opt/ml/code:${PATH}"

# this environment variable is used by the SageMaker PyTorch container to determine our user code directory.
ENV SAGEMAKER_SUBMIT_DIRECTORY /opt/ml/code

# /opt/ml and all subdirectories are utilized by SageMaker, use the /code subdirectory to store your user code.
COPY inference.py /opt/ml/code/inference.py

# Defines inference.py as script entrypoint 
ENV SAGEMAKER_PROGRAM inference.py

CloudWatch Log From /aws/sagemaker/Endpoints/mytestEndpoint

2022-09-30T04:47:09.178-07:00
Traceback (most recent call last):
  File "/usr/local/bin/dockerd-entrypoint.py", line 20, in <module>
    subprocess.check_call(shlex.split(' '.join(sys.argv[1:])))
  File "/opt/conda/lib/python3.8/subprocess.py", line 359, in check_call
    retcode = call(*popenargs, **kwargs)
  File "/opt/conda/lib/python3.8/subprocess.py", line 340, in call
    with Popen(*popenargs, **kwargs) as p:
  File "/opt/conda/lib/python3.8/subprocess.py", line 858, in __init__
    self._execute_child(args, executable, preexec_fn, close_fds,
  File "/opt/conda/lib/python3.8/subprocess.py", line 1704, in _execute_child
    raise child_exception_type(errno_num, err_msg, err_filename)

2022-09-30T04:47:13.409-07:00
FileNotFoundError: [Errno 2] No such file or directory: 'serve'
asked 2 years ago · 676 views
2 Answers

Hi, @holopekochan!

The serve script is installed by the SageMaker PyTorch Inference Toolkit when you pip-install it in the Dockerfile.

However, it's hard to say why it isn't found in your container. Are you sure you are using the inference container, not the training container, for your endpoint? If you go to AWS Console > Amazon SageMaker > Models > your model, which ECR image does it show under Container 1 - Image?
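The same check can be scripted instead of clicking through the console. The following is only a sketch: it assumes boto3 is installed and credentials are configured, and the helper names `container_images` and `fetch_model_images` are made up for illustration. It reads the image URI(s) from the `PrimaryContainer` and `Containers` fields of a SageMaker `DescribeModel` response.

```python
def container_images(describe_model_response):
    """Collect the ECR image URIs listed in a DescribeModel response dict."""
    images = []
    # Single-container models report their image under "PrimaryContainer".
    primary = describe_model_response.get("PrimaryContainer")
    if primary and "Image" in primary:
        images.append(primary["Image"])
    # Multi-container models list them under "Containers".
    for container in describe_model_response.get("Containers", []):
        if "Image" in container:
            images.append(container["Image"])
    return images


def fetch_model_images(model_name, region="us-west-2"):
    """Hypothetical helper: look up a model by name and return its image URIs."""
    import boto3  # imported here so container_images() works without boto3

    client = boto3.client("sagemaker", region_name=region)
    return container_images(client.describe_model(ModelName=model_name))
```

Calling `fetch_model_images("your-model-name")` should return the same URI that the console shows under Container 1 - Image.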

It would also help if you could share the code you used to set up the SageMaker PyTorch estimator (if any), how you define your PyTorchModel, and how you deploy() it.

Ivan (AWS)
answered 2 years ago
  • Thanks for the hint. I found the problem: I used the wrong Docker image, the EC2 one instead of the SageMaker one. I have simplified my question and posted the answer here in case someone else runs into the same problem.

  • Hi, @holopekochan! Good catch! I'm glad you found it!

Accepted Answer

You should use the SageMaker image:

763104351884.dkr.ecr.$REGION.amazonaws.com/pytorch-inference:1.12.1-gpu-py38-cu116-ubuntu20.04-sagemaker

instead of the EC2 image:

763104351884.dkr.ecr.$REGION.amazonaws.com/pytorch-inference:1.12.1-gpu-py38-cu116-ubuntu20.04-ec2
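To catch this kind of mix-up before deploying, the tag suffix can be checked in code. This is a minimal sketch that assumes the platform is always the last dash-separated part of the tag, as in the two tags above; the function names `image_platform` and `is_sagemaker_image` are hypothetical, and other images may not follow this naming.

```python
def image_platform(image_uri):
    """Return the last dash-separated part of the image tag, e.g. 'sagemaker' or 'ec2'."""
    # Everything after the final ':' is the tag.
    _, _, tag = image_uri.rpartition(":")
    # Assumption: the platform is the final '-'-separated component of the tag.
    return tag.rsplit("-", 1)[-1]


def is_sagemaker_image(image_uri):
    """True if the tag ends in '-sagemaker' (the variant that worked in this thread)."""
    return image_platform(image_uri) == "sagemaker"
```

For example, `is_sagemaker_image` is false for the `...-ubuntu20.04-ec2` tag used in the question's Dockerfile and true for the `...-ubuntu20.04-sagemaker` tag.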
answered 2 years ago
